Revolution Analytics' Future with Microsoft
Microsoft has acquired Revolution Analytics, which makes enterprise-grade products for the open source R statistical analysis and programming environment. It isn't Microsoft's first foray into open source -- although, from a BI standpoint, it's arguably its most exciting. Just what does Revolution Analytics bring to Microsoft, and what can Microsoft do for Revolution?
- By Stephen Swoyer
- January 28, 2015
First they came for Hadoop. Now they're coming for R.
“They” being Microsoft Corp., which earlier this week announced it had acquired R-focused vendor Revolution Analytics.
R is a popular open source software (OSS) programming environment for statistical analysis. It's roughly comparable to statistical analysis software developed by SAS and SPSS. R is extensively embedded in enterprise BI and analytic technologies.
Vendors such as Information Builders Inc., MicroStrategy Corp., Oracle Corp., SAP AG, Tableau Software Corp., and Teradata Corp. either embed R, connect to it, or (in the case of Teradata) can run it natively in the context of a DBMS engine. Analytic databases such as Vector and Matrix (both marketed by Actian), Kognitio, IBM Netezza, and Teradata Aster also support R.
Elsewhere, it's possible to use the RStudio development environment for R to access and analyze data stored in Microsoft's SQL Server database, but that's about it. Until now, Microsoft hasn't touted much in the way of R support, at least in its traditional -- i.e., on-premises -- SQL Server-based business intelligence (BI) stack. Redmond does market R as part of its Azure cloud services portfolio -- viz., Microsoft Azure Machine Learning (ML) -- along with several other OSS technologies, including Hadoop and Redis. (Microsoft has a creditable Hadoop-on-Windows strategy, thanks to its partnership with OSS Hadoop specialist Hortonworks Inc.)
In its traditional BI and data warehouse (DW) program efforts, however, Microsoft has mostly ignored R -- even as some of its partners (such as Predixion Software Inc.) have done otherwise. Truth is, SQL Server is itself a creditable platform for statistical analysis and data mining.
Primitive data mining features debuted with SQL Server OLAP Services in SQL Server 7, way back in 1998. For SQL Server 2000, Microsoft rebranded OLAP Services as SQL Server Analysis Services (SSAS). Since then, Microsoft has augmented SSAS with each new SQL Server release. Some vendors -- such as Predixion -- even built up their businesses by leveraging SQL Server and SSAS. (Nowadays, Predixion supports several analytic engines in addition to SQL Server. What's more, Predixion supports R itself: users can build predictive models in Predixion's Insight software and run them in R, or vice versa.)
Why did Microsoft acquire Revolution Analytics, a prominent commercial R vendor?
Could it have something to do with R's pervasiveness? With its claimed base of some two million users, a not-insignificant proportion of whom are using R in enterprise contexts? Its integration with strictly-SQL analytic database platforms from Actian, IBM Netezza, Oracle, SAP, and Teradata -- and with NoSQL platforms such as Hadoop? (ParallelR and RHadoop are two popular projects that parallelize R in Hadoop.) Might it have something to do with Revolution Analytics itself, which claims dozens of enterprise customers, has deep R coding expertise, and -- not least -- has also notched deep partnerships with prominent players such as Cloudera Inc. and Teradata, among others? The answer to these and other questions might well be: All of the above.
“Revolution Analytics provides an enterprise-class platform for the development and deployment of R-based analytic solutions that can scale across large data warehouses and Hadoop systems, and can integrate with enterprise systems,” wrote Joseph Sirosh, corporate vice president with Microsoft Machine Learning, in a blog post. “Its Revolution R product line ... help[s] people and companies realize the potential of big data using sound statistical, scientific methodologies. Top customers include some of the world’s largest banks and financial services organizations, pharmaceutical companies, consulting services organizations, manufacturing and technology companies.”
Sirosh also cited R's pervasiveness (e.g., its two million claimed users) and its Hadoop bona fides. He noted that Revolution Analytics contributed to OSS R development and claimed that Microsoft will continue this practice. “We will continue to support and evolve both open source and commercial distributions of Revolution R across multiple operating systems,” he wrote, citing Microsoft's support for OSS via Microsoft Azure and its so-called “Microsoft Open Technologies” effort.
David Smith, Revolution Analytics' chief community spokesperson, struck similar themes in a separate blog posting. Smith, too, trumpeted Microsoft's turnabout with respect to OSS -- he claimed Microsoft sponsors more than 1,600 projects on its own CodePlex site as well as on the popular GitHub version control repository -- and talked up Microsoft's ability to take R wide.
“We’re excited the work we’ve done with Revolution R will come to a wider audience through Microsoft. Our combined teams will be able to help more users use advanced analytics within Microsoft data platform solutions, both on-premises and in the cloud with Microsoft Azure. [J]ust as [important], the big-company resources of Microsoft will allow us to invest even more in the R Project and the Revolution R products,” he wrote. “We will continue to sponsor local R user groups and R events, and expand our support for community initiatives. We’ll also have more resources behind our open-source R projects including RHadoop, DeployR and the Reproducible R Toolkit.”
Smith, too, stressed that the post-acquisition Revolution Analytics would “continue to enhance open source R.” For users and customers, he said, everything should be business as usual.
“[N]othing much will change with the acquisition. We’ll continue to support and develop the Revolution R family of products -- including non-Windows platforms like Mac and Linux,” he wrote.
Research analyst Mark Madsen, a principal with Third Nature Inc., thinks the acquisition is a win-win for Microsoft and Revolution. “It's hard to upsell from freely available R to a higher performance, easier to manage R. You're fighting the expectations that are set when a company starts by selling anything open source. It's hard to justify improved performance as the basis for paying license and support fees,” he comments.
Madsen explains that Revolution had successfully managed to do just this, chiefly by augmenting OSS R packages with enterprise-friendly amenities, such as an enhanced visual design environment and visual-interactive user interface, in addition to training, consulting, support, and maintenance.
“On the Microsoft front, I think there's a match. R is possibly the most popular software in that market. Microsoft's core strength is in selling products to a large audience. Combine Revolution R with SQL Server and there's a potentially large market,” Madsen adds.
As for Smith's and Microsoft's stated goal of taking R to a “wider audience,” Madsen isn't so sanguine. On the one hand, he says, Microsoft with SSAS helped to demystify -- and, to some extent, to commoditize -- advanced analytic functions and data mining.
“Microsoft can reach that [enterprise] market far better than Revolution could on their own, and R gives you a lot more functionality than the data mining software Microsoft has been selling,” he points out. “However, I doubt there is a large number of R users in any organization relative to, say, BI users. For mainstream users, however, the [Microsoft]-ification of R means legitimacy. Not that these people know how to or are interested in using R, but Microsoft makes things like that accessible.”
He's skeptical, however, that Microsoft (or any other vendor) can put something like an R statistical workbench on every business desktop. It isn't just that the technology is hard to grasp; it's that rank-and-file users aren't clamoring for it in the first place. “I don't know how business-friendly you can make a technical interface for statisticians. That seems silly. It's better to take the output of the models and feed it into things that are friendly,” he concludes, arguing that the idea of “data mining for the masses” is a pipe dream. It might be possible to develop something like data mining for the masses -- of statistically savvy analysts and developers, he argues. In a sense, that's just what R did.
“The mistake vendors keep making is thinking that users want to use these tools. Users want the output of these tools, not the ability to create models, or rather some do, but they are not competent to do so and making easy-to-use tools won't make the process any easier.”