Experts Agile DW Environment
To create an agile environment, organizations must make it easier to deploy analysis tools and to quickly enhance their underlying data warehousing infrastructure. We explain some of the technologies and techniques that can help you make this happen.
- By Mike Schiff
- June 3, 2010
Data warehouses were once considered a major undertaking that involved a long-term implementation effort and significant resources. Now they often need to be quickly deployed so that organizations can act upon new opportunities, identify cost savings, incorporate decision-making capabilities within operational systems, and, in general, achieve or enhance their competitive advantages.
According to Merriam-Webster's Collegiate Dictionary, the definitions of agile include "having a quick resourceful and adaptable character." Although Merriam-Webster uses "an agile mind" as an example, I believe the definition also describes an agile data warehousing environment that enables users to quickly perform analyses in support of their decision-making processes. Furthermore, it also describes decision-making algorithms embedded in operational systems (e.g., customized offerings of additional products to Web-site purchasers or checking names against a terrorist watch list). This article focuses on ways to create a more agile data warehouse environment that facilitates end-user decision making.
Almost all of us have experienced situations where users wanted to augment their data warehouses with another data source, but by the time a decision was made to include this data, the deadline for acting on the opportunity had long passed. In these situations, typified by one-time "do or don't do" decisions, a 75 percent solution today is worth far more than a 100 percent solution only available after the fact.
This is not to say that any and all data should freely be sourced into an enterprise data warehouse (EDW). Rather, alternatives should be available to support decisions that require data not available from the EDW. Although the integrity of the data within an EDW should be sacrosanct, compromises can often be made in support of one-time, or limited-scope (e.g., departmental) analysis efforts. As long as users are aware of the potential data quality compromises, this can be done using special-purpose data marts, data warehouse appliances, analytic databases, or enterprise information integration (EII) implementations that directly access data from operational systems.
Even if the source data is available within the EDW, it may be more expedient to move the data to another platform for analysis. This could occur, for example, if the EDW is running at full capacity or another platform (such as a special-purpose appliance) could greatly outperform the EDW.
To create an agile environment, organizations must make it easier to deploy analysis tools and to quickly enhance their underlying data warehousing infrastructure. Fortunately, a variety of technologies and techniques facilitate this, including the following.
Cloud computing and third-party hosting of on-demand computing power, associated storage, and business intelligence tools allow organizations to quickly deploy a data warehouse platform with minimal upfront costs and without long-term commitments. A hosted platform can also serve as an adjunct to the existing in-house environment, expanding overall analysis capabilities on an "as-needed" basis. Several vendors also market customer-centric data through an on-demand channel.
Open source software, including data integration, database, and business intelligence offerings, permits organizations to obtain these capabilities economically without necessarily increasing their license fees for the proprietary software used in "mission-critical" enterprise data warehouse implementations.
Analytic (e.g., column-oriented) databases are optimized for analyzing large quantities of data and can dramatically reduce query run times.
Virtualization techniques can leverage existing hardware and reduce or eliminate the need to acquire additional computing power.
Organizational bureaucracy can be reduced by establishing or modifying procedures associated with special, one-time requests. For example, although a strong change management process is important for an enterprise data warehouse, it may be overkill for one-time analysis efforts. Along the same lines, the use of software from vendors not on the "approved vendor" list (including perhaps open source vendors) might also be allowed, at least as long as the one-time request does not evolve into a regularly run production application.
Dependence on IT can be reduced by creating parameter-driven report and analysis templates that can be used by less-skilled employees. Organizations should also consider establishing a vehicle for generalizing valuable but somewhat specific BI analyses (i.e., converting them to parameter-driven requests) so that they can be applied to other situations.
Enhanced search and delivery capabilities allow users to find relevant templates and existing analyses and to have results delivered directly to their mobile devices. Like reducing dependence on IT, this applies to almost all platforms within an organization's overall data warehousing environment.
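To make the idea of a parameter-driven report template concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The table name, column names, sample rows, and the run_report helper are all hypothetical and exist only for illustration; a real environment would point the same kind of template at its own data mart or warehouse. The query text stays fixed, and the user supplies only the date range and product.

```python
import sqlite3

# Hypothetical report template: the SQL is fixed, only the parameters vary.
SALES_BY_REGION = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE sale_date BETWEEN :start_date AND :end_date
      AND product = :product
    GROUP BY region
    ORDER BY region
"""


def run_report(conn, template, **params):
    """Run a parameterized template and return the rows as a list of tuples."""
    return conn.execute(template, params).fetchall()


def demo_connection():
    """Build a small in-memory table (made-up data) so the sketch is self-contained."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (region TEXT, product TEXT, sale_date TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            ("East", "widget", "2010-01-15", 100.0),
            ("East", "widget", "2010-02-10", 50.0),
            ("West", "widget", "2010-01-20", 75.0),
            ("West", "gadget", "2010-01-25", 200.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    rows = run_report(
        conn,
        SALES_BY_REGION,
        start_date="2010-01-01",
        end_date="2010-01-31",
        product="widget",
    )
    for region, total in rows:
        print(region, total)
```

Because the values are passed as bound parameters rather than pasted into the SQL text, a less-skilled user can rerun the same analysis for a different product or period without editing the query itself.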
The Final Word
An agile data warehousing environment requires an overall architecture that supports the integrity of enterprise data warehouses while still providing the flexibility to quickly react to special requests. To accomplish this, an organization should establish a flexible data warehousing architecture that may include both in-house and on-demand capabilities. It should also consider relaxing the change management process for one-time requests and deploying software tools that, though perhaps not appropriate for enterprise-wide applications, are well suited to special-purpose analysis needs. Strong search and delivery capabilities will further serve to make the data warehousing environment more agile.
Michael A. Schiff is a principal consultant for MAS Strategies. He can be reached at [email protected]