TDWI FlashPoint Newsletter TDWI FlashPoint Newsletter

Predictive Analytics: Making it Work

A Practical Guide to Analytics: Putting People, Process, and Technology to Work

TDWI Dimensional Data Modeling Primer: From Requirements to Business Analytics

Data Quality 

White Paper
The Top 10 Reasons for Choosing Open Source Data Integration

Quality White Paper
Is Your Approach to Multi-domain Master Data Management Upside-Down?
Data Quality White 

Paper Unique Requirements of Product Data Quality
Presented by Philip Russom
Event date: September 8, 2010

Data Quality White Paper Strategies for Designing and Architecting Performance Dashboards
Presented by Wayne Eckerson
Event date: September 14, 2010

TDWI Experts is a twice-monthly e-newsletter where BI/DW thought leaders share opinions and commentary about relevant industry topics and the latest technologies.

Article Image

August 5, 2010

SQL-Based Agile Analytics

Philip Russom
Senior manager of TDWI Research

Topic: Agile BI and Advanced Analytics

As you've no doubt noticed for the last few years, many enterprises are moving strongly into various forms of advanced analytics. They do it for the usual reasons: to understand constantly changing business environments, to identify bottom-line costs, and to discover new top-line opportunities for sales and product development. These are key to surviving and thriving in any economy, yet intensified by the ongoing recession.

In response, many firms are turning to three different technologies and practices:

Advanced Analytics. There are different analytic methods users can choose as they move beyond basic OLAP-based methods and into advanced analytics. Some users choose predictive analytic methods based on data mining, statistics, clustering, and so on. The majority of users, however, seem to be choosing SQL-based methods.

Extreme SQL. Users are diving deeper into SQL because it's a technology they know and trust, and they can leverage the SQL-based tools and skills they already have. The catch is, to make SQL analytic, you have to push it to extremes. Extreme SQL, as I call it, typically involves complex SQL routines with multi-pass logic, distributed queries, and lots of clauses, developed using an iterative method that starts with ad hoc queries.

Agile BI. This is a user-oriented best practice that seeks to speed up DW/BI development and delivery by jettisoning team bureaucracy and liberating single DW/BI team members to work directly and iteratively with business users who can articulate the business requirements clearly.

Sponsored Links

My point is that users -- for a few years now -- have been putting these three practices together to spawn a new practice that I call SQL-based agile analytics. I've also heard it called load-and-go analytics. Regardless of what you call it, here's what it does in terms of data preparation:

  • Users quickly load a few terabytes of data, often in response to a recent business event that needs study.
  • This large volume of data is mostly raw operational data, still in its original schema. It's probably drawn from multiple sources, each with a different schema.
  • Note that this reactive method doesn't allow time and resources up front for data transformation, cleansing, or remodeling. Instead, these are expressed via complex SQL, and analysts may do these later, once they gain an initial understanding of the data and the answers it can yield.

Being SQL-based, this form of agile analytics depends on lots of querying:

  • Analysts execute many ad hoc queries until the data reveals the answers they need
  • The queries get more complex with each iteration by a business analyst
  • Analysts apply SQL's more advanced capabilities to compensate for the raw state of data, resulting in lots of WHERE clauses, table joins (often multi-way), etc.

The point of SQL-based agile analytics is to get answers to analytic questions as soon as possible after a critical business event has occurred. For example, new forms of customer churn keep popping up, and it behooves a firm to understand each one ASAP to retain customers before they're lost. Likewise, drops in grid performance (in telco) or production yield (in manufacturing) are common topics for advanced analytics.

SQL-based agile analytics is all about agility. When a new business problem or opportunity comes up, the BI team must quickly shift focus to it. In response, the business team must react as soon as possible, based on insight from the BI team. For example, TDWI has interviewed BI team members who learned of a critical issue one morning and had a terabyte-scale analytic dataset with preliminary query results within hours. In turn, this enabled the business to respond that same afternoon with new incentives to retain customers.

Despite its current focus on tactical problems, TDWI sees SQL-based agile analytics increasingly used as a primary analytic method, not just a reactive one, so it's possible that you'll be doing it soon if you aren't already. This column just reveals the tip of the iceberg. To learn more, read TDWI's recent Checklist Report on Data Requirements for Advanced Analytics, available at

Philip Russom is senior manager of TDWI Research. Phil can be reached at .

Copyright 2010. TDWI. All rights reserved.

TDWI Membership TDWI Membership TDWI Membership