LESSON - Data Aggregation—Seven Key Criteria to an Effective Aggregation Solution
By Rich Ghiossi, VP Product Management and Marketing, HyperRoll
Companies today are faced with reporting and data analysis applications that are hamstrung by poor performance. Market and regulatory pressures are placing company CIOs in difficult positions. Furthermore, the amount of data being collected is increasing, as are the demands for more detailed analysis and reporting. Among the areas hardest hit by these challenges are:
- Timely financial close reporting
- Accurate sales and marketing data for developing more profitable customers
- Real-time disclosure to meet compliance regulations
In response, organizations have resorted to all manner of stopgap measures to coax performance out of BI applications, with little to no success.
What Is Data Aggregation and Why Should You Care?
Data aggregation is any process in which information is expressed in a summary form for purposes such as reporting or analysis. Ineffective data aggregation is currently a major factor limiting query performance. With up to 90 percent of all reports containing aggregate information, it is clear why proactively implementing an aggregation solution can yield significant performance benefits and give companies the opportunity to enhance their analysis and reporting capabilities.
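To make the definition concrete, here is a minimal Python sketch of aggregation in its simplest form: detail rows reduced to one summary figure per group. The sample records and the function name `total_by_region` are hypothetical and chosen only for illustration; they do not come from any particular product.

```python
from collections import defaultdict

# Hypothetical detail-level sales records: (region, product, amount).
SALES = [
    ("East", "Widget", 120.0),
    ("East", "Gadget", 80.0),
    ("West", "Widget", 200.0),
    ("West", "Widget", 50.0),
]


def total_by_region(rows):
    """Summarize detail rows into a single total per region."""
    totals = defaultdict(float)
    for region, _product, amount in rows:
        totals[region] += amount
    return dict(totals)


if __name__ == "__main__":
    print(total_by_region(SALES))
```

A report that shows sales by region needs only the two summary figures, not the four underlying rows. On real fact tables the same reduction spans millions or billions of rows, which is why the way it is computed matters for query performance.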
But how do you go about selecting an effective aggregation solution? First, let’s review the typical quick fixes that are used to improve query performance today. Then we’ll review the seven key criteria that will help companies evaluate an effective data aggregation solution.
Don’t Settle For Quick Fixes
Traditional approaches to solving ineffective data aggregation are no longer enough:
- New server hardware. BI applications relying on RDBMS infrastructure perform only incrementally better when additional hardware is introduced. Clearly, the added costs of capital equipment acquisition do not yield the exponential performance improvements required by today’s operational BI applications.
- Partitioning, de-normalization, and creating derivative data marts and OLAP cubes. Although they are more difficult to implement than many of the other quick fixes, these tried-and-true techniques have been used for many years to improve query performance. In reality, though, tuning takes time and is a continuous process, and it will not improve query performance enough to deliver the timely reports businesses require.
- Report caching and broadcasting. While caching may provide some performance relief, global organizations servicing geographically dispersed users find it increasingly difficult to allocate sufficient blocks of time to process these reports. The result of report caching and broadcasting is stale, canned reports that are hours or days old—providing limited benefit in an environment where ad hoc, on-demand reporting is a requirement.
- Summary tables. Anecdotal evidence suggests that organizations build only a limited number of summary tables that cover a very small percentage of all possible user requests. The maintenance burden introduced by even several dozen summary tables quickly outweighs their incremental benefit.
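A rough count shows why a few dozen summary tables cover so little. The Python sketch below assumes a simple model that is not taken from the article: each dimension can be summarized at any level of its hierarchy or dropped entirely, and every combination of those choices is a distinct summary a user might request. The dimension names and hierarchy depths are hypothetical.

```python
from math import prod

# Hypothetical hierarchy depths (number of levels) for each dimension,
# e.g. time = day, month, quarter, year.
LEVELS = {"time": 4, "product": 3, "geography": 4, "customer": 2}


def possible_summaries(levels):
    """Count the distinct aggregation-level combinations under the assumed model.

    Each dimension contributes its levels plus one extra choice for
    "rolled up entirely". One combination (every dimension at its finest
    level) is the fact table itself, so it is subtracted.
    """
    return prod(n + 1 for n in levels.values()) - 1


def coverage(tables_built, levels):
    """Fraction of the possible summaries covered by the tables actually built."""
    return tables_built / possible_summaries(levels)


if __name__ == "__main__":
    print(possible_summaries(LEVELS))
    print(f"{coverage(36, LEVELS):.1%}")
```

With these assumed depths there are 299 possible summary levels, so three dozen hand-built tables would cover about 12 percent of them. Adding one more dimension or hierarchy level multiplies the total, while the number of tables a team can maintain stays roughly fixed.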
The following key criteria were developed in collaboration with leading BI analysts and practitioners. Companies using these new criteria can now evaluate innovative technologies that have the capability to address ineffective data aggregation.
Seven Key Criteria for Selecting an Effective Aggregation Solution
- Enterprise-class solution. Enterprise-class solutions share a number of characteristics that should be required by any company serious about business intelligence. These solutions are architected to support dynamic business environments. They provide mechanisms to ensure high availability and easy maintenance, they allow for multi-server environments, and they support activities such as backup and recovery. They typically also offer more than one way to interface with the system.
  - Once designed, the solution should be easy to maintain, with little to no management necessary.
  - The solution must be able to adapt to ever-changing business requirements by having the ability to support changing hierarchies and structures (e.g., attribute to a dimension).
  - The system must leverage existing IT investments in BI environments and DB infrastructures.
  - Integration with the existing applications and systems must be simple. At a minimum, there must be a set of published APIs to popular BI applications and DB systems.
- Flexible architecture. A flexible architecture is one that accommodates rapid growth and change. This lets the solution provider respond quickly to the shifting needs of its customers, which is extremely important because the business environment is always changing.
  - The solution should use standard industry models to support complex aggregation needs.
  - The solution should support all types of reports and reporting environments.
  - The ideal architecture should optimize the balance between pre-aggregation and aggregation on the fly.
- Performance. Performance refers to the speed, responsiveness, and quality of the application. Queries that take hours to run are no longer acceptable to business users. Moreover, the data they receive must be fresh. The market demands current information in seconds to minutes in order to make judicious business decisions.
  - Query performance must be virtually instantaneous.
  - Users should not have to accept excessive build (pre-aggregation) times in exchange for good query performance.
  - Performance must be predictable, not dependent on users, data, or time-of-day variations.
- Scalability. The amount of data being collected is increasing. And, with the proliferation of technologies such as RFID that facilitate gathering even more transactional data, scalability will become even more important to plan for in the future.
  - The solution should support billions of rows and tens of dimensions with millions of members.
  - Incremental updates should take minutes per day to enable near-real-time processing.
  - The solution should support hundreds to thousands of concurrent users.
- Fast implementation. With implementation costs running at two to three times the price of software, it is imperative to evaluate implementation time as well as a product’s reliance on expensive IT resources.
  - The system should have a proven implementation methodology and approach.
  - The GUI tool should provide users with a wizard to speed development.
  - The solution should require little to no training.
  - Utility management and control processes should be in place.
- Efficient use of hardware and software resources. Solutions need to be evaluated on their ability to use hardware and software resources efficiently. Systems that promise significant improvements may also require exponentially more resources, a cost that can be unanticipated and substantial.
  - There should be minimal to no increase in CPU/processing requirements.
  - There should be minimal to no increase in storage requirements (e.g., no more than 20 percent of the storage required to store your fact data).
  - The solution should provide embedded compression and caching mechanisms.
- Price/performance. The technology requirements you select must be matched by the value the solution delivers for it to be worth implementing. Making financially responsible decisions is no longer just a goal, but rather a necessity.
  - The solution must be priced to scale with the needs of your business.
  - There should be no hidden long-term costs associated with supporting the solution.
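Two of the criteria above, combining pre-aggregation with aggregation on the fly and applying incremental updates, can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of how any product works: the class name `HybridAggregator`, the choice of (month, region) as the stored level, and the sample figures are all hypothetical.

```python
from collections import defaultdict


class HybridAggregator:
    """Stores totals at the (month, region) level and rolls up coarser answers on demand."""

    def __init__(self):
        self._cells = defaultdict(float)

    def load(self, rows):
        """Fold (month, region, amount) rows into the stored totals.

        Calling this again with only the new rows is an incremental update:
        existing totals are adjusted in place rather than rebuilt.
        """
        for month, region, amount in rows:
            self._cells[(month, region)] += amount

    def total(self, month=None, region=None):
        """Answer a query, where None means "all values" for that dimension."""
        if month is not None and region is not None:
            # Exact match on the stored level: a single lookup.
            return self._cells.get((month, region), 0.0)
        # Coarser level: sum the matching stored cells on the fly.
        return sum(
            value
            for (m, r), value in self._cells.items()
            if (month is None or m == month) and (region is None or r == region)
        )


if __name__ == "__main__":
    agg = HybridAggregator()
    agg.load([("2024-01", "East", 100.0), ("2024-01", "West", 50.0), ("2024-02", "East", 70.0)])
    print(agg.total(region="East"))   # rolled up on the fly across months
    agg.load([("2024-02", "East", 30.0)])  # incremental update with one new row
    print(agg.total(month="2024-02", region="East"))
```

The trade-off being sketched is that storing every possible level makes builds long and storage large, while storing nothing makes every query scan the detail data. Keeping one intermediate level and deriving the rest at query time sits between those extremes; where to draw that line is the optimization the criteria refer to.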
An effective data aggregation solution can be the answer to your query performance problems. Free your organization from the arbitrary restrictions placed on your BI infrastructure as a result of quick fixes, and turn reporting and data analysis applications into strategic, corporate-wide assets. For more information on this topic and on how to evaluate your aggregation solution’s effectiveness, visit www.hyperroll.com/7_Key_Criteria.