Connectivity: Open Source or Enterprise?
There's still a teeming market for best-of-breed data-source and application connectivity software. Business intelligence (BI) and analytics use cases, which emphasize ODBC and JDBC, are as hot as ever -- but so are RESTful APIs.
- By Steve Swoyer
- May 11, 2016
Late last year, Teradata Corp. announced the availability of free ODBC and JDBC adapters for Presto, a SQL query engine for Hadoop. Surprisingly, Teradata didn't take the obvious route -- tweaking and optimizing code from the open source software (OSS) community -- to develop its Presto adapters. It also didn't use its own in-house ODBC and JDBC intellectual property.
Instead, it licensed ODBC and JDBC connectors from Simba Technologies Inc., one of a handful of companies that provide best-of-breed data-source and application connectivity software.
Simba and Progress Software Corp. (which markets the DataDirect line of connectivity adapters) are two of the bigger players in a market that still teems with commercial vendors. Other players include Actual Technologies LLC, EasySoft Ltd., and OpenLink.
Teradata's director of technical product marketing, Imad Birouty, said there's a simple reason his company chose to license commercial ODBC and JDBC adapters instead of developing adapters from existing OSS ones. "We received a lot of requests from Netflix and other customers [saying] 'We need the ODBC drivers sooner [than promised by the initial road map].'"
He wouldn't comment on any other possible reason why Teradata opted not to use (or to optimize) OSS adapters. A research analyst who's familiar with some of the available OSS offerings has a theory, however.
"Most of the available [OSS] connectors are ... not so great," said this analyst, who argued that although the adapters themselves do have shortcomings, problems with OSS connectors are also a function of the relentless pace of development in the OSS world.
Major changes to critical top-level projects -- such as when the Spark cluster computing framework abruptly abandoned "Shark" (a Hive-dependent SQL engine for Spark) in favor of a brand-new technology, Spark SQL -- can break downstream support. Such major changes are more common in OSS development than in enterprise commercial software, this analyst said.
Not surprisingly, Dion Picco, general manager of data connectivity and integration at Progress Software, also has some thoughts on this. He agrees that significant changes to top-level projects are a big problem for OSS ODBC and JDBC adapters. Other problems with these adapters, he says, include incomplete compatibility with ANSI SQL standards and inconsistent interoperability with tools, whether they are commercial BI/analytics offerings or top-level OSS projects.
"There's a relentless pursuit of innovation [in the OSS space]. There are things [such as] consistency and backward-compatibility, those kinds of qualities sometimes take a back seat to [innovation]," he argues.
"One thing we've done is we've pledged [to offer] day-one support for certain [OSS] projects. If a new version of Hive comes out and our customers upgrade on day one, if they find an issue, we'll fix it as soon as possible. We've had customers decide to leave [i.e., stop using DataDirect's adapters] in favor of open source technologies. Eventually, we usually see them come back to us. They're just fed up with having to pay for services engagements when something goes wrong. They're fed up with security holes."
Picco says most ODBC and JDBC drivers use SSL to support encrypted connections. The person, group, or entity charged with maintaining a particular OSS ODBC or JDBC package may be diligent about slipping in new versions of OpenSSL when vulnerabilities are discovered and patched. On the other hand, they might not be, he claims. "When was the last time [those libraries were] updated to the newest version? Does the new [SSL] version break compatibility with the driver? It's stuff like that."
The OSS world can be very particular, and in some cases prickly, about the provenance and/or purity of code. At the very least, there's a preference for unencumbered, non-IP-restricted code.
This is another reason Teradata's decision to license commercial ODBC and JDBC drivers was surprising. With this deal, customers get the adapters for free but they don't get the source code. In some contexts, this would be a significant negative. In this case, neither Birouty nor the analyst quoted above thinks it will be a problem: enterprise IT organizations are not OSS purists, they point out.
Similarly, a person familiar with Teradata's decision making noted, "There hasn't been a discussion of that on our side. We didn't think [unencumbered source code] was all that important: it's just a driver. Does anybody care where the driver comes from? Say some guy out there already wrote a JDBC driver. Why not use that? Is it robust? Is it reliable? Is it secure? The last thing a business needs is they're writing their reports and they have a problem [with the driver]. We just said, 'Let's go enterprise-class. We've partnered with Simba for decades. Let's use them.'"
Different APIs for Different Tasks
Picco claims that sales of ODBC, JDBC, and other adapters to enterprise customers account for only a small portion of Progress Software's revenue. OEM licenses make up the lion's share of sales.
At the same time, he argues, the traditional OEM market for adapters that support stateful, direct connectivity between client and server (such as ODBC and JDBC) has to some extent been challenged by adapters that support RESTful, stateless connectivity -- the kind exposed by software-as-a-service (SaaS) offerings such as Salesforce.com.
For many kinds of app-development projects, particularly efforts to integrate or connect Salesforce apps in the cloud with (new or existing) on-premises enterprise apps, this makes sense.
It most emphatically does not make sense for the BI and analytics use case, Picco argues. "APIs are hot, but I can tell you right now that no two APIs are created equal -- and that not every API, no matter how good [i.e., rich] it is, is ideal for every use case. A lot of people got their start with enterprise [application] integration around Salesforce, and Salesforce makes a great API. It performs very well. It's very reliable. So a lot of people think, 'If I could do all of this [with Salesforce's native APIs], I could do other stuff with it, such as [BI] reporting and analytics,'" he says.
This is not the case, however. "The public reference here is NetSuite. They have a pretty good API, too, but they built a BI connector with our technology. What they're doing is they're giving their customers ODBC and JDBC access [to their data in NetSuite] using our technology. When you want to give people BI[-type] access, ODBC and JDBC are better options," Picco says.
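Picco doesn't spell out why SQL-style access suits BI better, but one commonly cited reason is that a SQL interface lets the data source do the aggregation, whereas a stateless, record-oriented API typically leaves the client to page through records and total them itself. The Python sketch below illustrates that contrast under stated assumptions: it is not drawn from Progress, NetSuite, or Salesforce, the sample rows and the fetch_page function are hypothetical, and the standard-library sqlite3 module merely stands in for a data source reached over ODBC or JDBC.

```python
import sqlite3

# Hypothetical sample data: (region, amount) rows standing in for SaaS records.
ROWS = [("east", 100), ("west", 250), ("east", 50), ("west", 25), ("north", 75)]


def fetch_page(offset, limit=2):
    """Hypothetical stateless, REST-style endpoint (an assumption for this
    sketch): each call carries its own offset and returns one page of
    records; nothing is remembered between calls."""
    return ROWS[offset:offset + limit]


def totals_via_pages():
    """Client-side aggregation: request page after page, then sum locally."""
    totals, offset, calls = {}, 0, 0
    while True:
        page = fetch_page(offset)
        calls += 1
        if not page:  # an empty page signals the end of the data
            break
        for region, amount in page:
            totals[region] = totals.get(region, 0) + amount
        offset += len(page)
    return totals, calls


def totals_via_sql():
    """Source-side aggregation: one SQL statement, and only the grouped
    result comes back. sqlite3 is a stand-in for an ODBC/JDBC source."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", ROWS)
        cur = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cur.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    paged_totals, calls = totals_via_pages()
    print("paged API totals:", paged_totals, "after", calls, "calls")
    print("SQL totals:", totals_via_sql())
```

Both routes arrive at the same per-region totals. The difference is where the work happens: the paged route needs one round trip per page plus client-side summing, while the SQL route sends a single statement and receives only the grouped result, which is the shape of request BI and reporting tools generally issue.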
Progress was among the exhibitors at TDWI's recent conference in Las Vegas. Direct enterprise sales account for a relatively small percentage of overall revenues, Picco reiterates; instead, he claims, Progress was exhibiting primarily to raise awareness.
Awareness about what? Take the scope of NoSQL and big data: it's about much more than just Hadoop, Picco points out. "The thing that gets missed here at TDWI and at similar conferences is the scope of it all. When we think big data and NoSQL, here everybody is very focused on Hadoop. From our own experience, we're seeing a ton of Mongo[DB], a ton of Cassandra, a ton of Spark SQL, a ton of [Amazon] Redshift. We have customers using Cassandra who are pumping in billions of data points a day. That's big data."
About the Author
Stephen Swoyer is a technology writer with 20 years of experience. His writing has focused on business intelligence, data warehousing, and analytics for almost 15 years. Swoyer has an abiding interest in tech, but he’s particularly intrigued by the thorny people and process problems technology vendors never, ever want to talk about.