HDS Buys Pentaho

Hitachi Data Systems has acquired Pentaho, but what was the company thinking?

Enterprise computing giant Hitachi Data Systems (HDS) last week acquired open source software (OSS) business intelligence (BI) and analytics specialist Pentaho.

According to BloombergBusiness, the value of the acquisition, which was not publicly disclosed, is estimated at between $500 million and $600 million. For purposes of comparison, TIBCO Software Inc. last year paid $185 million for JasperSoft Inc., one of Pentaho's OSS rivals. (The $195 million that Tibco spent for data visualization best-of-breed Spotfire Inc. back in 2007 looks like a bargain now.)

HDS itself has almost no presence in BI and data warehousing (DW) -- or the big data analytic space, for that matter -- with the obvious exception of its role as a provider of enterprise storage and compute resources. All of which raises an important question: what was HDS thinking?

In a prepared statement, HDS' Kevin Eggleston invoked the nostrums of big data and the Internet of things (IoT). “Data remains an untapped resource for many organizations and businesses -- with the realization of the value of that data remaining a challenge,” said Eggleston, senior vice president of social innovation and global industries with HDS. “The combination of Hitachi's broad industry expertise, advanced information technologies, and now Pentaho software and the talented team of experts, will enable us to give customers a more complete solution to manage their data -- allowing them to leverage the power of big data and Internet of [t]hings in a quicker and simpler way.”

There are important questions about just what Hitachi is getting -- free and clear, at least.

Pentaho, like rival OSS player JasperSoft Inc., built its Pentaho Business Analytics Enterprise Edition (EE) atop an OSS technology stack, innovating on top of projects such as Mondrian (an OSS OLAP engine), Kettle (OSS ETL technology), and Weka (OSS data mining technology), among others. These projects use a mix of OSS licenses: Weka, for example, uses the GNU General Public License (GPL) versions 2.0 and 3.0; Mondrian, on the other hand, is distributed under the Eclipse Public License.

What's more, Pentaho offers a Community Edition (CE) version of Pentaho Business Analytics that's available under GPL version 2.0 (GPLv2), the GNU Lesser GPL (LGPL) version 2.0, and the Mozilla Public License (MPL) 1.1. That version is free for commercial usage -- albeit without support. It doesn't include custom code bits that are distributed with Pentaho Business Analytics EE.

The traditional approach to marketing open source software commercially involves selling an enterprise version of an OSS stack that includes custom code along with maintenance and support. Both JasperSoft and Pentaho started out doing just this. Both largely abandoned this approach, however. JasperSoft, for its part, focused on developing a developer-oriented BI platform-as-a-service (PaaS) offering; Pentaho, on developing a best-of-breed BI analytic platform. Both continued to incorporate OSS projects or code; both made greater use of custom code bits, too.

The question, then, isn't whether HDS is getting anything; rather, it's what, exactly, is HDS getting?

In other words, what custom code bits is HDS getting? What custom code bits can HDS be getting? The GPL v2, for example, permits a user to make changes to a project's source code, but requires that a user who distributes a modified version make the source for those changes available under the same license. (The LGPL relaxes this requirement for software that merely links to an LGPL-licensed library.) As a prominent research analyst who's familiar with Pentaho's stack told BI This Week, there's no question Pentaho has generated its share of custom code. It has valuable IP, but there are doubtless cases in which that IP is encumbered -- i.e., can't trivially be disentangled from IP that Pentaho (or HDS) doesn't own free and clear. Given its OSS underpinnings, there's a lot of IP Pentaho doesn't own free and clear. “The problem in cases like this is contamination. Non-employee-committed code means that others own the rights to bits of software,” this person said.

On the other hand, this isn't a new or insoluble problem. OSS companies are purchased all the time. What makes it so interesting in HDS' acquisition of Pentaho, this research analyst said, is the extremely high price tag: HDS reportedly paid anywhere from $500 million to $600 million. What does HDS think it's getting for all of that money? The answer seems to be: nobody knows.

As one analyst-blogger contacted by BI This Week put it: “[It's the] craziest [censored] thing.”

On the surface, the acquisition invites comparison with Cisco Systems Inc.'s 2013 purchase of the former Composite Software Corp., but Cisco's thinking in that case was at least discernible, if not screamingly obvious. Composite marketed data virtualization (DV) software that abstracts physical resources (i.e., data managed by physical database systems residing in physical storage) and minimizes data movement. Even then, DV logic was starting to shift into the network layer; to the extent that network routing and switching hardware is engineered to incorporate this logic -- e.g., via intelligence about the location (local or remote) of data, or even by caching remote data locally at the router or switch level -- DV and networking, Cisco's métier, can be said to complement one another.

Cisco also explicitly connected the acquisition of Composite with its Unified Computing System (UCS) initiative. HDS' rationale in acquiring Pentaho seems less discernible. True, HDS isn't just getting Pentaho's IP, however valuable that might be; it's also getting Pentaho's human expertise: the developers, modelers, architects, and integrators who design, build, run, and optimize Pentaho's Business Analytics platform. HDS itself has been reselling Pentaho's products for more than a year, which suggests that it, too, has developed a Pentaho competency of its own.

Has it developed half a billion dollars' worth of Pentaho competency?

The acquisition of Pentaho continues a recent streak of BI-related acquisitions. Last May, Tibco acquired JasperSoft; in early December, OpenText acquired BI reporting stalwart Actuate Corp. (for approximately $330 million -- an 89 percent premium over Actuate's stock price); in January, meanwhile, Microsoft picked up OSS R specialist Revolution Analytics.

About the Author

Stephen Swoyer is a technology writer with 20 years of experience. His writing has focused on business intelligence, data warehousing, and analytics for almost 15 years. Swoyer has an abiding interest in tech, but he’s particularly intrigued by the thorny people and process problems technology vendors never, ever want to talk about. You can contact him at
