LESSON - The Perils of Data Warehouse Success
By Kim Stanick, VP Product Marketing, DATAllegro
A successful data warehouse is a double-edged sword. Because growing demand increases pressure on limited resources, data warehouse success involves a constant weighing of priorities. Disappointment can result when concurrent demands cannot be feasibly met within the budget.
The problem may simply be a matter of budget cycles, but it may also indicate a more chronic condition. As a business grows and evolves, so must the data warehouse. Diminishing effectiveness can be subtle and difficult to spot. To overcome it, you must recognize the leading indicators. Here are a few to watch for:
A growing backlog of enhancement requests. User requests for additional data or queries cannot be fulfilled because of limited capacity.
Limits placed on data warehouse use. Restrictions are imposed on ad hoc queries, concurrency, access windows, query execution time, query complexity (for example, no joins), non-priority use (such as exploration), or the number of additional users.
New applications never leave the “back burner.” There are desirable new applications that leverage existing data, but the current platform can’t handle the additional workload.
Historical data is archived too soon. Users would like to access deeper history but there is insufficient capacity to allow it.
It may be acceptable for a successful new data warehouse to be slightly “behind the curve” (more requests coming in than can be handled quickly). A successful mature data warehouse, however, should not have a huge and growing backlog of unmet needs, because its viability has been established and its planning is more predictable. If a mature warehouse does have a growing backlog, analyze the situation to find out why.
If you notice any of these leading indicators, your data warehouse’s effectiveness may be diminishing. The cause often boils down to a lack of scalability in one form or another, and can be explained by the data warehouse capability gap. The capability gap exists when the rate of data growth outpaces the capability of the data warehouse platform. This condition has two effects: increased total cost of ownership (it costs more to get smaller increments of additional performance) and, eventually, a physical technology barrier (the platform has reached its maximum configuration limits). Both lead to non-consumption (usage limits, delayed implementations, etc.) and therefore to diminishing data warehouse effectiveness.
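The capability gap can be made concrete with a rough projection. The following is a minimal sketch, assuming both data volume and platform capacity grow at fixed annual rates; the function name, the parameters, and the figures in the usage note are hypothetical and chosen only for illustration.

```python
def years_until_gap(data_tb, capacity_tb, data_growth, capacity_growth, horizon=20):
    """Return the first year in which projected data volume exceeds
    projected platform capacity, or None if that never happens within
    the horizon.

    Assumptions (illustrative only): both quantities compound once a
    year at a constant rate, e.g. 0.5 means 50% growth per year.
    """
    for year in range(horizon + 1):
        if data_tb > capacity_tb:
            return year
        # Advance both projections by one year.
        data_tb *= 1 + data_growth
        capacity_tb *= 1 + capacity_growth
    return None
```

For example, with 10 TB of data growing 50% a year on a platform that holds 20 TB and grows 20% a year, `years_until_gap(10, 20, 0.5, 0.2)` returns 4: the gap opens in the fourth year even though the platform starts with twice the needed capacity.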
Keeping step with technology advancements is an important part of an overall strategy to avoid the problem of diminishing data warehouse effectiveness. Here are a few tips on how to do that:
Harness rapid improvements in price performance. Given today’s climate of rapid technology improvements and commoditization, good price performance is much easier to ensure. Designing an environment to nimbly take advantage of price performance gains will give you more mileage out of your existing data warehouse budget.
Reduce the amount of integration. The maturing of open standards and the trend toward “pre-integration” mean you can shift more of the integration burden onto the vendor. This will allow you to spend less time and effort rolling out new functionality.
Ensure platform scalability. Lack of scalability (concurrency, complexity, capacity) will limit your data warehouse’s effectiveness. When selecting data warehouse technology, be sure it can achieve linear scale-up (double the platform and you double the performance) and scale-out (double the platform and the workload and you get the same performance).
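The scale-up and scale-out definitions above can be turned into a simple check against benchmark timings. This is a minimal sketch, assuming you can measure the elapsed time of a representative workload before and after growing the platform; the function names and the sample timings are hypothetical.

```python
def scale_up_efficiency(base_time, scaled_time, platform_factor):
    """Same workload, larger platform.

    Returns observed speedup divided by the platform growth factor.
    1.0 means linear scale-up (double the platform, double the
    performance); lower values mean diminishing returns.
    """
    speedup = base_time / scaled_time
    return speedup / platform_factor


def scale_out_efficiency(base_time, scaled_time):
    """Platform and workload grown by the same factor.

    Returns 1.0 when elapsed time is unchanged (linear scale-out);
    lower values mean the larger configuration runs slower.
    """
    return base_time / scaled_time
```

For example, if a query set takes 100 minutes on the current platform and 80 minutes on a platform twice the size, `scale_up_efficiency(100, 80, 2)` returns 0.625, well short of linear. If doubling both the platform and the workload raises elapsed time from 100 to 125 minutes, `scale_out_efficiency(100, 125)` returns 0.8.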
By resisting the temptation to limit data warehouse usage and instead taking advantage of improved price performance, you can maximize your data warehouse’s effectiveness and prolong its success.