TDWI Blog

Philip Russom, Ph.D., is senior director of TDWI Research for data management and is a well-known figure in data warehousing, integration, and quality, having published over 550 research reports, magazine articles, opinion columns, and speeches over a 20-year period. Before joining TDWI in 2005, Russom was an industry analyst covering data management at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and consultant, was a contributing editor with leading IT magazines, and was a product manager at database vendors. His Ph.D. is from Yale. You can reach him by email ([email protected]), on Twitter (twitter.com/prussom), and on LinkedIn (linkedin.com/in/philiprussom).


Advanced Analytics versus Online Analytic Processing (OLAP)

Blog by Philip Russom
Research Director for Data Management, TDWI

The current hype and hubbub around big data analytics has shifted our focus to what’s usually called “advanced analytics.” That’s an umbrella term for analytic techniques and tool types based on data mining, statistical analysis, or complex SQL – and sometimes natural language processing and artificial intelligence as well.

The term has been around since the late 1990s, so you’d think I’d get used to it. But I have to admit that the term “advanced analytics” rubs me the wrong way for two reasons:

First, it’s not a good description of what users are doing or what the technology does. Instead of “advanced analytics,” a better term would be “discovery analytics,” because that’s what users are doing. Or we could call it “exploratory analytics.” In other words, the user is typically a business analyst who is exploring data broadly to discover new business facts that no one in the enterprise knew before. These facts can then be turned into an analytic model or some equivalent for tracking over time.

Second, the thing that chafes me most is that the way the term “advanced analytics” has been applied for fifteen years excludes online analytic processing (OLAP). Huh!? Does that mean that OLAP is “primitive analytics”? Is OLAP somehow incapable of being advanced?

I personally don’t think so. In fact, depending on how you design and implement it, OLAP can be quite advanced. For example, OLAP is very much about dimensions. In the 90s, eight dimensions was considered an advanced implementation. Nowadays I regularly talk with people who have twenty or more. I realize there’s a difference between advanced and mature. But I have to say that I’ve seen lots of mature OLAP implementations that support hundreds of cubes, hundreds of OLAP reports, and thousands of users. Over the years, different approaches to OLAP (multidimensional, relational, desktop, etc.) have consolidated into a hybrid OLAP, such that most vendor products today are quite mature, feature rich, and flexible.

Here’s another, related issue. While researching a new TDWI report on big data analytics, I ran across a few people (users, consultants, and vendors) who think that “advanced analytics” (or whatever you want to call it) will render OLAP obsolete. Therefore, user organizations should expunge OLAP from their BI portfolios. Uh, no. I don’t see that happening.

In defense of OLAP, it’s by far the most common form of analytics in BI today, and for good reasons. Once you get used to multidimensional thinking, OLAP is very natural, because most business questions are themselves multidimensional. For example, “What are western region sales revenues in Q4 2010?” intersects dimensions for geography, function, money, and time. Discoveries made in OLAP are easily “institutionalized” or “operationalized” (much more so than advanced analytics), so OLAP analyses are repeated over time with consistency. Since dimensions are easily expressed as parameters, an OLAP-based report can be as easy to use as a parameterized report, thereby putting OLAP-based analytics within the comprehension of a vast range of possible end-users.
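
To make the multidimensional point concrete, here’s a minimal sketch in Python (my own illustration with made-up column names, not taken from any particular OLAP tool) of how that question becomes a slice of a cube-like dataset, and how dimensions double as report parameters:

```python
import pandas as pd

# Hypothetical fact table: one row per sale, with dimensional columns.
sales = pd.DataFrame({
    "region":  ["West", "West", "East", "West"],
    "quarter": ["2010-Q4", "2010-Q3", "2010-Q4", "2010-Q4"],
    "product": ["A", "B", "A", "B"],
    "revenue": [1200.0, 900.0, 1500.0, 700.0],
})

def revenue_report(df, region, quarter):
    """Dimensions act as parameters: slice along them, then aggregate."""
    sliced = df[(df["region"] == region) & (df["quarter"] == quarter)]
    return sliced.groupby("product")["revenue"].sum()

# "What are western region sales revenues in Q4 2010?"
print(revenue_report(sales, region="West", quarter="2010-Q4"))
```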

The scope of discovery of an analytic method seems to be an important concern right now, as seen in the current fascination with big data analytics. In that context, a possible limitation of OLAP is that most implementations are tightly coupled to datasets called cubes. If the information someone hopes to discover is not in a cube, that can be a problem. Even so, so-called relational OLAP can be a solution, and OLAP tools are so friendly nowadays that just about anyone can create a cube. Depending on how an OLAP implementation is designed and which vendor tools are used, a cube can limit the scope of discovery, just as any analytic dataset can – even if it’s multi-terabyte big data.

In my mind, advanced analytics is very much about open-ended exploration and discovery in large volumes of fairly raw source data. But OLAP is about a more controlled discovery of combinations of carefully prepared dimensional datasets. The way I see it: a cube is a closed system that enables combinatorial analytics. Given the richness of cubes users are designing nowadays, there’s a gargantuan number of combinations for a wide range of users to explore.

So, OLAP’s not going away. Users would be nuts to abandon their large investments in such a handy technology. And it’s like most situations in IT: few things go away. Organizations just keep adding more tool types and best practices to their portfolios. Therefore, user organizations should expect to maintain their useful investments in OLAP, while also digging deeper into other forms of exploratory and discovery analytics.

So, what do you think, folks? Let me know. Thanks!

Posted by Philip Russom, Ph.D. on August 5, 2011


Big Data Analytics: Avoid the Analytic Cul-De-Sac

Blog by Philip Russom
Research Director for Data Management, TDWI

Do you know what a cul-de-sac is? In French, it literally means “bottom of the bag.” But figuratively it means what most Americans would call a “dead-end street.” In residential real estate, a cul-de-sac is a desirable place to live. In analytics, a cul-de-sac is where the epiphanies of advanced analytics never get off a dead-end street to be fully leveraged elsewhere in the enterprise.

The current hype around big data analytics has most discussions of analytics focused on “discovery” analytics. That’s where a business analyst or similar user employs an advanced analytics tool (based on data mining, statistics, natural language processing, complex SQL, etc.) to discover facts never known before. For example, the analyst may discover the root cause for a new form of customer churn, a new partner behavior that’s potentially fraudulent, or the hidden costs that erode otherwise profitable customers.

While researching a new TDWI report on big data analytics, I’ve run across a number of business analysts who revel in the chase around the cul-de-sac but can’t be bothered with operationalizing their epiphanies. “That’s someone else’s job,” one guy told me. Here’s what I mean.

Too often, analysts drive around a figurative big data “bottom of the bag” until just the right dataset yields an epiphany. Then they share their findings with managers and move on to the next analytic project.

This is an analytic cul-de-sac: the analyst does not also take the findings off the dead-end street and “operationalize” them. In other words, once you discover the new form of churn, analytic models, metrics, reports, warehouse data, and so on need to be updated, so the appropriate managers can easily spot the churn and do something about it quickly if it returns. Likewise, hidden costs, once revealed, should be operationalized in analytics (and possibly reports and warehouses), so managers can better track and study costs over time, to keep them down.
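
To show what operationalizing can look like in the simplest terms, here’s a hedged sketch in Python. The churn rule and column names are entirely hypothetical; the point is that a one-off discovery becomes a repeatable check that a report or dashboard can refresh on a schedule:

```python
import pandas as pd

# Hypothetical rule discovered during analysis: customers on the legacy
# plan with two or more support calls this month tend to churn.
def flag_churn_risk(customers: pd.DataFrame) -> pd.DataFrame:
    at_risk = (customers["plan"] == "legacy") & (customers["support_calls"] >= 2)
    return customers.assign(churn_risk=at_risk)

def churn_risk_share(customers: pd.DataFrame) -> float:
    """A repeatable metric managers can track over time."""
    return flag_churn_risk(customers)["churn_risk"].mean()

customers = pd.DataFrame({
    "plan": ["legacy", "new", "legacy"],
    "support_calls": [3, 0, 1],
})
print(f"Share of customers at churn risk: {churn_risk_share(customers):.0%}")
```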

I think that most analysts and similar users are avoiding analytic cul-de-sacs by making sure that discovered epiphanies are operationalized by someone (whether the analyst or another team member). I’m just saying that the product of analytics isn’t necessarily being leveraged to the hilt in every organization.

To avoid analytic cul-de-sacs and similar squanderings of insight, you might want to review some of the processes around your use of advanced analytics. In particular, be sure the process extends beyond discovery into operationalizing the epiphanies of analytics.

So, what do you think, folks? Let me know. Thanks!

Posted by Philip Russom, Ph.D. on July 21, 2011


Big Data Analytics: Preparing Analytic Data Differs from ETL for Data Warehousing

Blog by Philip Russom
Research Director for Data Management, TDWI

While researching a new TDWI report on big data analytics, I’ve run across a few BI professionals who are concerned about the seeming lack of data preparation that’s common with some forms of advanced analytics. Allow me a moment to sort this out.

On the one hand, all of us in BI and data warehousing are indoctrinated to believe that the data of an enterprise data warehouse (EDW) (and hence the data that feeds into reports) must be absolutely pristine, integrated and aggregated properly, well documented, and modeled for optimization. To meet these requirements, BI teams work hard on extract, transform, and load (ETL), data quality (DQ), metadata and master data management (MDM), and data modeling. These data preparation best practices make perfect sense for the vast majority of the reports, dashboards, and OLAP-based analyses that are refreshed from data warehouse data. For those products of BI, we want to use only well-understood data that’s brought as close to perfection as possible. And many of these reports become public documents, where problems with the data could be dire for a business.

On the other hand, preparing data for advanced analytics requires very different best practices – especially when big data is involved. The product of advanced analytics is insight, typically an insight about bottom-line costs or customer churn or fraud or risk. These kinds of insights are never made public, and the analytic data they’re typically based on doesn’t have the reuse and publication requirements that data warehouse data has. Therefore, big data for advanced analytics rarely needs the full battery of ETL, data quality, metadata, and modeling we associate with data from an EDW.

In fact, if you bring to bear the full arsenal of data prep practices on analytic datasets, you run the risk of reducing their analytic value. This is ironic, because we usually think of ETL, DQ, and data modeling as adding value to data, not subtracting it. So, how can they harm analytic data?

To answer that question, let’s first take a look at so-called “advanced analytics.” This collection of analytic techniques would be better called “discovery analytics,” because that’s what users do with it. A business analyst or similar user applies techniques like data mining, statistical analysis, complex SQL, MapReduce, and natural language processing to discover facts about the business that no one knew before. For example, you might discover the root cause of the latest form of customer churn. Or you might find a cluster of transactions that indicate a new kind of fraud. Or you could stumble onto an untapped customer segment.

In general, you can’t discover those entities and facts from the overly studied, calculated, modeled, and aggregated data of an EDW. Instead, you need big data, with lots of granular detail, typically in the schema of the source systems it came from. Some forms of analytics actually thrive on questionable data in poor condition. For example, analytic applications for fraud detection may depend on outliers and non-standard data as indications of fraud. And the insights of discovery analytics often focus on narrow slices of the business, like an obscure customer segment, time frame, group of shipments, transaction type, or risky neighborhood. These thin slices can easily disappear in an aggregation pass. Hence, if you apply ETL and DQ processes to big data as you do for a data warehouse, you run the risk of stripping out the very nuggets that make big data a treasure trove for discovery-oriented advanced analytics. This is why the preparation of data for discovery analytics seems minimal (even slipshod) – often just extracts and table joins – compared to the full range of data prep applied to EDW data.
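
Here’s a small Python sketch (hypothetical tables and values, my own illustration) of what that minimal preparation looks like in practice: an extract and a join, with no cleansing or aggregation, so the outliers that might signal fraud survive into the analytic dataset:

```python
import pandas as pd

# Hypothetical extracts pulled straight from source systems, details intact.
transactions = pd.DataFrame({
    "account_id": [101, 102, 103, 103],
    "amount": [45.0, 52.0, 9999.0, -45.0],   # outliers deliberately left in
    "merchant": ["grocer", "grocer", "JEWELER??", "grocer"],
})
accounts = pd.DataFrame({
    "account_id": [101, 102, 103],
    "segment": ["retail", "retail", "retail"],
})

# "Minimal" prep: just an extract and a table join.
analytic_set = transactions.merge(accounts, on="account_id", how="left")

# The questionable values are exactly what discovery analytics wants to see.
suspects = analytic_set[(analytic_set["amount"] > 5000) | (analytic_set["amount"] < 0)]
print(suspects)
```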

Does this mean that we can throw out the sacrosanct best practices for ETL, DQ, metadata, MDM, and data modeling? No, of course not. Some organizations will simply need to suspend them for discovery analytics with big data—but only temporarily. Here’s a typical scenario.

After business analysts and other users have discovered what they’re looking for in big data, they need to take the discovery to the BI and DW team, so the results can be “institutionalized” in the EDW. For example, when discovery analytics reveals valuable items – like new forms of churn, customer segments, cost centers, etc. – these need to be represented by data structures in the EDW and reports, so that business people can track them regularly. At that point, the best practices of data preparation come back into play.

So, what do you think, folks? Let me know. Thanks!

Posted by Philip Russom, Ph.D. on July 12, 2011


Big Data Analytics: The View from SAP

Blog by Philip Russom
Research Director for Data Management, TDWI

A few weeks ago, I talked with Mike Eacrett, the vice president of product management for SAP HANA at SAP Labs. Among other things, Mike explained the “secret sauce” that gives SAP HANA flexibility and performance for big data analytics. Give me a moment to recount Mike’s explanation.

Philip Russom: What forms of analytics are you seeing on the rise with SAP customers?

Mike Eacrett: SAP customers continue to expand their investments in online analytic processing (OLAP). But the explosive growth is with exploratory analytics. That’s where a business user needs to learn things that he/she didn’t know to ask before. Or they need to see patterns or the absence of them in the data, typically in response to a change in the business or customer behavior. This kind of exploration requires big data, typically in its original source schema with all its details intact. Instead of transforming and cleansing the data prior to analysis (which can lose desirable data details), the user iteratively develops queries that manipulate data at the analytic tool level, not the physical storage level, as you would when, say, modeling a data warehouse.

Philip Russom: I’m familiar with this analytic method, so I know that it requires a hefty platform for big data analytics. What is SAP offering in this regard?

Mike Eacrett: We offer the SAP In-Memory Computing Appliance, otherwise known as SAP HANA. It’s an enterprise software architecture that enables analytic queries to run against detailed source data—and run fast in real time—without need for transforming the data into data models optimized for a specific type of analysis. To achieve this, SAP HANA implements its own massively parallel distributed processing method (similar to some of the concepts of MapReduce), based on HANA’s in-memory database, running code that utilizes the instruction set and vector processing capabilities of Intel chip sets. That means that the SAP user needn’t define analytic queries months in advance, then wait for IT to model data for them. All the data is available at their fingertips in memory. HANA gives logical data modeling a new twist, so that the analyst user can run queries as fast as he or she thinks them up, and without being limited by data models, data movement, and pre-aggregation constraints.

Philip Russom: You mentioned that SAP HANA gives logical data modeling a new twist. What do you mean?

Mike Eacrett: The term for this new technique is “logical data marting.” It assumes that all of the operational source data present in SAP modules and needed for analytics is also available in SAP HANA. A logical data model of a data mart is constructed in server memory, based on the analytic query that’s being executed. In SAP HANA-based applications, the same data model is used for online transaction processing (OLTP) and analytics – in other words, the data marts are a logical view of one persistence layer. The logical model draws data from the modules’ underlying memory-persisted tables, as needed by queries. As an analyst or HANA-based application iteratively redefines a query, the model automatically redraws itself, using analytic and calculation views. The logical model (based on queries against the pre-built SAP business content) liberates analysts from cumbersome data modeling, and the in-memory processing gives it true real-time speed.
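
To illustrate the general idea of a logical data mart – and this is a generic sketch using SQLite in Python, not SAP HANA syntax or its actual API – think of the mart as a view defined over one in-memory persistence layer, so no data is copied or pre-aggregated:

```python
import sqlite3

# One in-memory persistence layer serves both transactional rows and analytics.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, "West", 120.0), (2, "East", 80.0), (3, "West", 45.0)])

# The "logical mart" is just a view over the live transactional table.
con.execute("""
    CREATE VIEW revenue_by_region AS
    SELECT region, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
""")

# Analysts query the view; redefine the view and the "mart" redraws itself
# without moving or duplicating any data.
print(con.execute("SELECT * FROM revenue_by_region").fetchall())
```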

So, what do you think, folks? Let me know. Thanks!

Posted by Philip Russom, Ph.D. on June 27, 2011


Big Data Analytics: Frequently Asked Questions (FAQ)

Blog by Philip Russom
Research Director for Data Management, TDWI

What exactly is Big Data Analytics?

It’s two things: big data and the kind of analytics users want to do with big data. Let’s start with big data, then come back to analytics.

Users interviewed by TDWI state that data isn’t big until it breaks 10 terabytes. So that’s the low end of big data. And some user organizations have cached away hundreds of terabytes – just for analytics. The size of big data is relative; hundreds of terabytes isn’t new, but hundreds of terabytes just for analytics is – at least, for most user organizations.

Big Data is all about multi-terabyte datasets, right?

No, there’s more to it than that. Size aside, there are other ways to define big data. In particular, big data tends to be diverse, and it’s the diversity that drives up the data volume. For example, analytic methods that are on the rise need to correlate data points drawn from many sources, both inside the enterprise and outside it. Furthermore, one of the new things about analytics is that it’s NOT just based on structured data, but also on unstructured data (like human language text), semi-structured data (like XML files and RSS feeds), and data derived from audio and video. Again, the diversity of data types drives up data volume.

Finally, big data can be defined by its velocity or speed. This may also be defined by the frequency of data generation. For example, think of the stream of data coming off of any kind of sensor, say thermometers sensing temperature, microphones listening for movement in a secure area, or video cameras scanning for a specific face in a crowd. With sensor data flying at you relentlessly in real time, data volumes get big in a hurry. Even more challenging, the analytics that go with streaming data have to make sense of the data and possibly take action—all in real time.

Hence, big data is more than large datasets. It’s also about diverse data sources or data types (and these may be arriving at various speeds), plus the challenges of analyzing data in these demanding circumstances.

What kinds of analytics go with big data?

The kind of analytics applied to big data is often called “advanced analytics.” A better term would be “discovery analytics,” because that’s what users are trying to accomplish. In other words, with big data analytics, the user is typically a business analyst who is trying to discover new business facts that no one in the enterprise knew before. To do that, you need large volumes of data with a lot of detail. And this is usually data that the enterprise has not yet tapped for analytics. For example, in the middle of the recent economic recession, companies were constantly being hit by new forms of customer churn. To discover the root cause of the newest form of churn, a business analyst grabs several terabytes of detailed data drawn from operational applications to get a view of recent customer behaviors. He may mix that data with historic data from a data warehouse. Dozens of queries later, he’s discovered a new churn behavior in a subset of the customer base. With any luck, he’ll turn that information into an analytic model, with which the company can track and predict the new form of churn.
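
As a rough illustration of that last step (turning the discovery into an analytic model), here’s a hedged Python sketch with made-up features, assuming scikit-learn is available:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features from the detailed operational extract:
# [support_calls_last_90_days, months_on_legacy_plan]
X = np.array([[3, 12], [0, 0], [4, 18], [1, 2], [5, 24], [0, 1]])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = churned in the following quarter

# The discovery, captured as a model the company can score customers with.
model = LogisticRegression().fit(X, y)
new_customers = np.array([[2, 10], [0, 3]])
print(model.predict_proba(new_customers)[:, 1])  # predicted churn risk
```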

What kind of analytic tool does a business analyst need for the “discovery analytics” that’s common with big data?

Discovery analytics against big data can be enabled by different types of analytic tools, including those based on SQL queries, data mining, statistical analysis, fact clustering, data visualization, natural language processing, text analytics, artificial intelligence, and so on. It’s quite an arsenal of tool types, and savvy users get to know their analytic requirements before deciding which tool type is appropriate to their needs.

Is big data a problem just to be managed (with its size, diversity, and speed) or is it an opportunity to be seized?

TDWI is currently running an Internet-based survey about big data analytics. An early extraction of survey data shows that only 30% of users responding to the survey are concerned about the technical challenges of collecting and managing big data. The vast majority – namely 70% of the users responding to the survey – say that big data is definitely an opportunity. That’s because, through analysis, the user organization can discover new facts about its customers, markets, partners, costs, and operations, then use that information for business advantage.

So, what do you think, folks? Let me know. Thanks!

========================================
Don’t miss TDWI’s Big Data Analytics Survey. Please share your opinions and experiences by taking the online survey.

Posted by Philip Russom, Ph.D. on June 21, 2011


The Three Vs of Big Data Analytics: VELOCITY

Blog by Philip Russom
Research Director for Data Management, TDWI

In prior blogs, I’ve talked about how big data’s primary attribute is data volume. That’s pretty obvious. But it’s defined by other characteristics, too. For example, one of the things that makes big data so big is that it’s coming from a greater variety of sources than ever before. Now let’s look at the last of the three Vs of Big Data Analytics, namely data velocity.

Data Feed Velocity as a defining attribute of Big Data

Big data can be described by its velocity or speed. Or you may prefer to think of it as the frequency of data generation or frequency of data delivery. For example, think of the stream of data coming off of any kind of sensor, say thermometers sensing temperature, microphones listening for movement in a secure area, or video cameras scanning for a specific face in a crowd. This isn’t new; many firms have been collecting click stream data off of Web sites for years, using streaming data to make purchase recommendations to Web visitors. With sensor and Web data flying at you relentlessly in real time, data volumes get big in a hurry. Even more challenging, the analytics that go with streaming data have to make sense of the data and possibly take action—all in real time.
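
Here’s a tiny Python sketch (a simulated feed, not a real sensor API) of why streaming data changes the analytic pattern: the logic has to evaluate each reading as it arrives and act immediately, rather than analyzing a finished dataset after the fact:

```python
import random
import time

def sensor_stream(n=20):
    """Stand-in for a real feed: yields one temperature reading per tick."""
    for _ in range(n):
        yield random.gauss(21.0, 3.0)
        time.sleep(0.01)  # simulate the arrival rate

THRESHOLD = 27.0

# Decide on every reading the moment it arrives; there is no batch window.
for reading in sensor_stream():
    if reading > THRESHOLD:
        print(f"ALERT: temperature {reading:.1f} exceeds {THRESHOLD}")
```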

So that you don’t think this is all science fiction, allow me to share some of the use cases for high-velocity data feeds and streams that I’ve heard recently.

Here’s an unsubstantiated anecdote that someone told me: “There’s a cell service provider in Japan that collects GPS data from cell phone users. The cell provider collects the data in real time, and keeps track of which people are walking the furthest. Once a month, the cell provider gives an award to the walker who covered the greatest distance. In a way, cell phones are working like sensors to collect and analyze streaming big data.”

I also heard a similar anecdote: “Imagine that I’m a consumer walking around downtown in a city, and I’m shopping. Now imagine letting a shopping service know where I am, plus maybe the kinds of goods I’m looking for. As I walk, the GPS coordinates could stream to the shopping service, and it could point me to stores that match my interests.”

A consultant who specializes in streaming data told me about some video and audio analytic applications he’s looking into: “Think about the algorithms that enable us to parse text and perform sentiment analysis, sometimes in real time. Very similar algorithms can parse video images to document and analyze changes in the thing that’s being imaged. Satellite images could monitor and analyze troop movements, a floodplain, cloud patterns, and grass fires. Or a video analysis system could monitor a sensitive or valuable facility, watching for possible intruders, then alert authorities in real time. You can implement similar applications with sound monitoring; one of my apps involves two thousand underground microphones to listen for movement in geologic formations. I hope it can eventually help predict earthquakes.”

Here’s a related user story about streaming big data that I heard recently: “You don’t need all of the streaming data. You just need the interesting pieces or just the one piece that identifies what you’re looking for. We’ve all seen video footage from the US military’s unmanned jet drones. A drone is processing several frames of video per second looking for shapes or light signatures that match its programming. For example, it might be looking for shapes that look like tanks or sun reflections that could come from metallic weapons. The drone deletes almost all of the frames, because they’re not of interest. And that helps avoid data glut that could choke the system.”

A prominent Internet-based business told me a few weeks ago: “We load 200 gigabytes a day into our data warehouse. But that’s processed down from several terabytes of Web log and click-stream data. We mix this big data with data about our customers drawn from other touch points, then analyze it. Although the data is streaming, we collect the stream on disk, then process it down and analyze it overnight. Our next step is to process and analyze streaming big data in real time. We’re definitely a customer-oriented business, so understanding customers and serving them better is the goal of analytics. We just need to do it both after the fact in batch and – eventually – in real time.”

So, what do you think, folks? Let me know. Thanks!

========================================
This blog is number 3 in a series of 3, all about the three Vs of big data analytics, namely data volume, variety, and velocity. You can read the first blog here. And you can read the second blog here.

Don’t miss TDWI’s Big Data Analytics Survey. Please share your opinions and experiences by taking the online survey.

Posted by Philip Russom, Ph.D. on June 17, 2011