The Agony and the Ecstasy of Analytics in the Cloud
        
        It can be fiendishly hard to translate a BI and analytic practice to the cloud. It can be no easier to spin up a new BI and analytic practice.
        
By Stephen Swoyer
July 1, 2014
It can be fiendishly hard for an organization to translate its business intelligence (BI) and analytic practice lock, stock, and still-smoking data warehouse system to the cloud. It can be no easier to spin up a newly fledged BI and analytic practice in a cloud context. That's what software-as-a-service (SaaS) veteran GoodData Inc. discovered last year.
With the recent release of its Open Analytics Platform -- a platform-as-a-service (PaaS) offering -- GoodData claims to have learned from its mistakes. The truth, suggests Jeff Morris, vice president of marketing for GoodData, is that for most companies, analytics is intensely personal.
"Analytics is personal for everybody. I just don't think you can realistically expect to take a [pre-fab] model and apply it [without modification] to what you're doing in your business," he observes.
Until recently, GoodData tried marketing application-specific analytic SaaS. As ideas go, Morris says, this one sounded great in the abstract: customers could spin up analytic practices for specific business processes or specific business subject areas in the cloud.
In practice, it didn't work out so well, he concedes. 
"When we were doing the app-specific [offering], we had lots of interest from line-of-business customers. We found that we were doing a poor job of managing expectations: almost every [customer] required some kind of customization or [professional] services of some kind. Maybe their sales process is weird -- and whose sales process isn't weird? They're all customized -- or they do all kinds of services on the back end," he says. "As a result, the deployments take longer, the customer was less satisfied with the result, their expectations weren't being met. It wasn't working."
Morris says that GoodData derived a valuable lesson from this experience: change isn't just an inevitable result of doing business; it has a disproportionate impact on a business's analytic practices. Even the simplest of business changes -- the hiring of a new salesperson -- can radically change an analytic practice. Bigger changes, such as the hiring of a new sales manager, can have even more disruptive effects.
"Every time a sales manager comes onboard to an organization, they change the sales process. They change the qualification criteria, they change the sales pipeline stages, and so on. There's not a single sales manager I've seen who doesn't do things like that," Morris explains. "There's all of the age-old things about business intelligence projects failing. I suspect that one of the reasons why they fail is that you changed people out, which in turn changes all of the KPI criteria, which in turn changes all of the data integration criteria," he continues. "For many [people], the analytics [they consume] are so highly personal: it's super-charged. It boils down to what KPI do I watch? When I'm doing my GoodData marketing analytics, what I'm watching is different from my predecessor. BI is a really personal exercise -- it's going to change on the whim of almost every consumer."
Platform Play
This March, GoodData announced its Open Analytics Platform, which it markets as a governed BI, analytic, and visual discovery PaaS offering. GoodData's Open Analytics Platform boasts pre-configured connectivity to a number of cloud sources, including mainstays such as Salesforce.com, NetSuite, Marketo, and Amazon Web Services, along with the ability to get at on-premises systems via JDBC and JMS. In its cloud back-end, it makes use of an array of named and unnamed storage engines, including the Vertica massively parallel processing (MPP) columnar database from Hewlett-Packard Co. and a Hadoop-based data store for MapReduce ETL. This is what's known, but Morris and other officials are relatively tight-lipped about what else, exactly, helps power GoodData's Open Analytics Platform.
For example, GoodData's "Platform Requirements Guide" makes explicit reference to the Cassandra distributed database as well as to a "Distributed File System" that manages application state information, while ever-vigilant industry watcher Curt Monash speculates that the GoodData platform taps several other open source software (OSS) projects such as Spark (a clustered analytics platform that runs on top of Hadoop) and Shark (a distributed SQL query engine for Spark) in some way.
By the way, you don't normally think of doing ETL in a cloud context -- in fact, many cloud vendors might stake their marketing on the fact that they don't do ETL -- but GoodData actually makes ETL a focal point of its platform pitch. For example, developers can use its CloudConnect studio to integrate data from external cloud sources -- fetched via REST APIs -- and to transform it for ingestion by the GoodData platform. ETL is performed in Hadoop, and another component in GoodData's Open Analytics Architecture, its SLINode services, performs data normalization.
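For readers who want a concrete picture of that fetch-transform-ingest flow, here is a minimal sketch in Python. It is not GoodData's CloudConnect tooling or API: the payload, the field names, and the extract, transform, and load functions are all hypothetical, and an in-memory list stands in for the platform's data store. A hard-coded JSON string takes the place of a real REST response so the example runs without network access.

```python
import json
from datetime import datetime

# Hypothetical payload standing in for the JSON body a cloud source's
# REST API might return. Field names and values are invented for illustration.
RAW_RESPONSE = json.dumps([
    {"Id": "001", "Stage": " Qualified ", "Amount": "1200.50", "CloseDate": "2014-03-15"},
    {"Id": "002", "Stage": "closed won", "Amount": "800", "CloseDate": "2014-04-01"},
    {"Id": "003", "Stage": "Qualified", "Amount": None, "CloseDate": "2014-04-20"},
])


def extract(raw):
    """Parse the JSON body of an (assumed) REST response into Python dicts."""
    return json.loads(raw)


def transform(records):
    """Normalize field names, types, and stage labels; skip rows with no amount."""
    rows = []
    for rec in records:
        if rec["Amount"] is None:
            continue  # incomplete record: leave it out of the load
        rows.append({
            "opportunity_id": rec["Id"],
            "stage": rec["Stage"].strip().lower(),
            "amount": float(rec["Amount"]),
            "close_date": datetime.strptime(rec["CloseDate"], "%Y-%m-%d").date(),
        })
    return rows


def load(rows, target):
    """Append transformed rows to a list that stands in for the target store."""
    target.extend(rows)
    return len(target)


warehouse = []  # hypothetical stand-in for the analytic platform's storage
load(transform(extract(RAW_RESPONSE)), warehouse)
```

The transform step is where the customization Morris describes tends to land: if a new sales manager renames or adds pipeline stages, the stage-normalization logic has to change with them.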
How is what GoodData is undertaking with its Open Analytics Platform different from what it tried in its first foray into cloud analytics? Morris says it's a function of having a scalable PaaS, a business-driven, phased, iterative focus, and -- for lack of a better word -- experience: PaaS analytic efforts are almost always going to involve significant customization. It's the nature of doing analytics, and the Open Analytics Platform was designed to accommodate that customization, he argues.
"We said, let's figure out what's the core analytic that you want to enjoy as your first one [i.e., project], that allows us to start defining the end-user visualization, which is going to involve a line-of-business stakeholder, here's what you're going to see at the end, we like to start with that first, and that to most people is something tangible," he comments.
"Once we understand the problem, you can gather up where the data comes from. We have our own professional services group that helps architect out the system to sort of work with the GoodData way," Morris concludes. "The initial implementation is ... [figuring out] what your dashboard is going to look like at the end, what's the logical data model that I need to drive that dashboard, and that's going to drive the KPIs. You take it in stages, identify tangible goals, and evolve it that way."