High-Performance Data Warehousing: The Final Frontier
Are you ready for high-performance data warehousing? That's the 64-zettabyte question.
- By Stephen Swoyer
- October 30, 2012
Are you ready for high-performance data warehousing (HiPer DW)?
That's the 64-zettabyte question posed by Philip Russom, research director for data management with TDWI Research, the research arm of TDWI.
Russom's new Best Practices report -- High Performance Data Warehousing -- tackles both the what and the what-for of HiPer DW, which he helpfully distinguishes from big data, real-time data, streaming data, multi-structured data, analytic data, and so on. "[J]ust about everything we do in BI, DW, DI, and analytics nowadays has some kind of high-performance requirement, for both business and technology reasons," he writes.
It's not just the data warehouse. The concept of high-performance data warehousing encompasses the entire BI, DW, DI, and analytics technology stack. Obviously, data warehouses are under pressure to scale up to big data volumes, but enterprise BI platforms are under as much pressure to scale to large communities of concurrent users, plus the thousands of reports those users need.
Data processing workloads for advanced analytics are demanding in their complexity, even when operating on modest data volumes. Yet the largest share of technical users is under the gun to deliver more data, reports, and analyses via real-time operation, as seen in fast-paced business practices such as operational BI, business performance management, and management dashboards.
According to Russom, it's a matter of speed, scale, complexity, and concurrency -- in some combination. "High-performance data warehousing ... is primarily about achieving speed and scale while also coping with increasing complexity and concurrency," he writes. "These are the four dimensions that define HiPer DW. Each dimension can be a goal unto itself; yet, the four are related. For example, scaling up may require speed, and complexity and concurrency tend to inhibit speed and scale."
Each of the attributes Russom describes constitutes a Big Problem relative to conventional data warehousing. Collectively, they add up to an unprecedented problem. Data management (DM) practitioners, in tandem with (and frequently in opposition to) their programming colleagues in the non-DM world, have developed technologies that permit them to deal with most of these issues in isolation or, to some degree, in combination. However, scaling a data warehouse for both size and performance while simultaneously addressing bleeding-edge complexity requirements and supporting a high degree of concurrency is becoming a common challenge.
Engineering a HiPer DW solution is therefore an issue that most large enterprises are going to find themselves grappling with. Russom also cites several cutting-edge applications of analytic technologies that he says will tax existing DW systems along the fault lines of speed, scale, complexity, and concurrency.
Just as the now-common practice of operational BI has pushed the BI community closer and closer to real-time operation, "the emerging practice of operational analytics will ... [likewise push] a variety of analytic methods [closer to real-time operation]. Many analytic methods are based on SQL, making the speed of query response more urgent than ever. Other analytic methods are even more challenging for performance due to iterative analytic operations for variable selection and reduction, binning, and neural net construction," Russom writes.
"Out on the leading edge, events and some forms of big data stream from Web servers, transactional systems, media feeds, robotics, and sensors; an increasing number of user organizations are now capturing and analyzing these streams, then making decisions or taking actions within minutes or hours."
The good news is that DM professionals don't seem to be afraid of the challenge. Most, in fact, perceive HiPer DW as an opportunity, not as a problem. Nearly two-thirds (64 percent) of respondents to a recent TDWI survey described HiPer DW as an opportunity -- precisely because it "enables new, broader, and faster business practices."
Russom's 36-page report includes a thorough consideration of the costs, benefits, and challenges of HiPer DW.