Is There a Best Discovery Platform?
Which is the best discovery platform? According to several prominent analytic database vendors, it doesn't matter. You're asking the wrong question.
By Stephen Swoyer
October 8, 2012
Which is the best discovery platform?
In spite of what you're likely to hear from enthusiasts of QlikView, Tableau, Spotfire, and other self-described "discovery" vendors, the question doesn't have a single, all-purpose answer.
According to vendors such as ParAccel Inc. and Teradata Inc., it might not even be the best or most appropriate question. If you're running QlikView, Tableau, Spotfire or any other discovery tool against a sub-par analytic engine, these players maintain, you're missing out: your voyage of discovery is going to founder within sight of shore.
At TDWI's recent World Conference in San Diego, for example, both ParAccel and Teradata, independently of one another, positioned themselves as discovery platforms.
Teradata trumpeted the discovery capabilities of its Aster analytic platform, which it acquired last year when it purchased the former Aster Data Systems Inc.
"There's almost this need for what I'm calling a 'discovery' platform," Chris Twogood, product marketing manager for Teradata platforms, told BI This Week.
He distinguished between Teradata's positioning with respect to Aster and its traditional focus on Active Data Warehousing, which emphasizes mixed workloads and high concurrency in the context of a centralized enterprise data warehouse. "At one point, Teradata was a real discovery platform, but -- and this hasn't happened everywhere -- the data warehouse, because it has become so mission-critical, IT is starting to lock it down a little bit [to ensure] governance and make sure the quality of the data is there."
In other words, Twogood said, Teradata is "not really being used by people who just want to discover. You almost need a separate platform that enables you to just load data, do root cause analysis, [and] have it be that sandbox for discovering things."
This is precisely how Twogood and Teradata position Aster. "That's a great role that Teradata Aster plays. Within a Teradata Aster environment, you're not going to run thousands of applications and thousands of users, but as a discovery platform, [Aster] offers you the ability to do SQL, do MapReduce, [and] do Java. It provides the framework to do that kind of core discovery … we have [existing Teradata customers] adding Teradata Aster for that purpose."
ParAccel Doubles Down
ParAccel, on the other hand, has been talking about discovery analytics for some time now. Whereas Teradata's discovery ambitions are -- for the moment, anyway -- confined to Aster, ParAccel is doubling down: discovery, its representatives maintain, is ParAccel's thing.
"There's a role that we play, and that role is that we're the analytic engine, the analytic grunt; we're trying to create an architecture where both data and process can migrate to the most important platform. For discovery -- for complex ad hoc SQL or analytic workloads of that nature -- that's us," Rick Glick, ParAccel's newly-minted vice president of products and technology, told us in San Diego.
At TDWI's World Conference in Chicago, vice president of solutions and product marketing John Santaferraro explained how ParAccel embeds more than 500 analytic functions in its columnar database engine. "We embed analytic functions … in a library approach, so that the analyst can very easily use a SQL call and run some of those analytic functions that sit in the library," he said.
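The "library approach" Santaferraro describes, in which an analytic function lives inside the database and an analyst reaches it with an ordinary SQL call, can be illustrated with a minimal sketch. This is not ParAccel's API: it uses Python's built-in sqlite3 module as a stand-in engine, and the function name `lib_median`, the `sales` table, and the sample figures are all hypothetical.

```python
import sqlite3
import statistics


class Median:
    """Aggregate that collects the values in a group and returns their median."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row; NULLs are skipped, as most SQL aggregates do.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Called once per group, after the last row.
        return statistics.median(self.values) if self.values else None


def build_demo_connection():
    """Create an in-memory database with one library function and sample data."""
    conn = sqlite3.connect(":memory:")
    # Register the function in the engine's "library" (hypothetical name).
    conn.create_aggregate("lib_median", 1, Median)
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("east", 30.0), ("east", 20.0),
         ("west", 5.0), ("west", 15.0)],
    )
    return conn


def median_by_region(conn):
    """What the analyst writes: plain SQL that calls the library function."""
    rows = conn.execute(
        "SELECT region, lib_median(amount) FROM sales "
        "GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())
```

The point of the sketch is the division of labor: someone registers the function once, and every SQL-speaking tool that connects afterward can call it without knowing how it is implemented.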
By "analyst," Santaferraro said, he doesn't mean "data scientist."
"That approach makes it very easy for other business analysts and SQL users to run those functions without having to be data scientists. They can use what's already in the library, so a MicroStrategy user or a[n SAP] BusinessObjects user could actually begin using more analytics than they were used to using with just those platforms."
Is this all it takes? Just embed scores or hundreds of analytic functions and proclaim yourself a "superior" analytic platform? No. The issues most likely to bedevil users of so-called discovery tools -- from traditional tools such as QlikView to newer offerings, such as those marketed by SAP BusinessObjects (Visual Intelligence), IBM Cognos (Insight), MicroStrategy Inc. (Visual Insight), and Microsoft Corp. (Power View) -- are the same issues that have long bedeviled users of traditional business intelligence (BI) tools.
The chief problem is connectivity or, more properly, getting access to data in a format that a particular discovery or BI tool can consume and understand.
ParAccel, however, claims to address this very issue. It says it has developed several "On-Demand Integration" (ODI) modules that offer native connectivity to Hadoop, Teradata, and ODBC data sources. The idea is that the query (or type of query) itself triggers the ODI, which handles the access.
As Santaferraro explains it, "a user [can] go out and get additional modules at the point [where] they're running the query. They can get data from a database, they can pull in information from Hadoop, they can call Hadoop from a running job."
In specific contexts, the invocation of an ODI can be coupled with user-defined functions (UDFs). This permits a data management team to, for example, pre-program specific kinds of transformations or call analytic functions in connection with certain kinds of ODI tasks.
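The general pattern described here, in which the query itself selects a connector at run time and an optional user-defined function is applied to what comes back, can be sketched in a few lines of Python. This is an illustration of the idea only, not ParAccel's ODI interface: the registry, the `kind:target` reference syntax, the connector functions, and the canned rows are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]
Connector = Callable[[str], Iterable[Row]]
Transform = Callable[[Row], Row]

# Hypothetical registry mapping a source kind to the module that reads it.
CONNECTORS: Dict[str, Connector] = {}


def register(kind: str):
    """Decorator that adds a connector to the registry under the given kind."""
    def wrap(fn: Connector) -> Connector:
        CONNECTORS[kind] = fn
        return fn
    return wrap


@register("hadoop")
def read_hadoop(path: str) -> Iterable[Row]:
    # Stand-in for a real Hadoop read; returns canned data.
    return [{"source": "hadoop", "target": path, "clicks": 3}]


@register("odbc")
def read_odbc(table: str) -> Iterable[Row]:
    # Stand-in for a real ODBC query; returns canned data.
    return [{"source": "odbc", "target": table, "clicks": 7}]


def fetch(reference: str, transform: Optional[Transform] = None) -> List[Row]:
    """Resolve 'kind:target' at query time, then apply an optional UDF per row."""
    kind, _, target = reference.partition(":")
    if kind not in CONNECTORS:
        raise KeyError(f"no connector registered for {kind!r}")
    rows = CONNECTORS[kind](target)
    return [transform(row) if transform else row for row in rows]
```

A data management team could pre-program a transformation and pass it along with the request, for example `fetch("hadoop:/logs/web", lambda r: {**r, "clicks": r["clicks"] * 2})`, while an analyst who needs only the raw rows calls `fetch("odbc:orders")`.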
Teradata's answer to this problem comes by way of its Unity effort, says Twogood. He invokes industry analyst Wayne Eckerson's "analytical ecosystem" and Gartner analyst Mark Beyer's "logical data warehouse" to describe Teradata's approach with Unity, which in this case can be used to manage queries and access across multiple platforms.
Both Eckerson and Beyer, like their colleague Shawn Rogers, who espouses what he calls the "hybrid data ecosystem," describe a scheme in which an abstraction layer (typically, data virtualization technology) is interposed between data sources and BI or discovery client tools.
"What Teradata offers with our analytical ecosystem and some of the software we've introduced around Unity is really one of the first vendors that's furthest along in realizing this idea of the logical data warehouse," Twogood argues. "Different things ... require workload-specific functions. Whether you're doing stuff as a discovery platform, or MapReduce [on giant data sets] ... you need to be able to have the capability to have different types of technologies that work in combination with one another."