Business-Driven Data Warehouse Architectures
The evolution TDWI is seeing in data warehouse architectures is due in part to evolving business practices and goals.
By Philip Russom, TDWI Research Director
Data warehouse architectures have been experiencing a rather dramatic evolution in recent years, and they will keep evolving into the foreseeable future. Some of the deepest influences on data warehouse architecture are coming from evolving business practices and business goals. This makes sense because a well-aligned data warehouse is a mirror that reflects the business or other organization that it serves. So you know what to expect, here are the more prominent business drivers behind evolving data warehouse architectures today.
Reporting is more important than ever to business operations. In many businesses, reports are the primary mechanism for disseminating operational information daily. If reporting stops, business processes stop or slow to a crawl. For that reason alone, businesses and other organizations need to protect their traditional data warehouse architectures, which are largely designed to provide data for reports and related deliverables such as management dashboards, performance management, and online analytic processing (OLAP).
What reporting does for established operations, discovery analytics does for new business development. Businesses are moving into open-ended, discovery-oriented analytics so they can discover new facts about the business, its customers and partners, and competitive pressures. Discovering "new things" is unlikely with the well-known, heavily prepped data that comes from the average report-oriented data warehouse. Many businesses are extending their data warehouse environments to include new standalone data platforms that are conducive to discovery analytics, such as columnar databases, data appliances, NoSQL databases, and Hadoop.
Big data is about business analytics, not the bigness of data per se. Multiple TDWI surveys show that technology and business people alike agree that the primary path to business value from big data leads through discovery analytics. The current frenzy to figure out how to capture, store, and process big data is just a means to the valuable end of analyzing that data for organizational advantage, as noted above. The rush to satisfy the data requirements of business analytics (whether with big data or traditional enterprise data) is the leading driver for change in data warehouse architectures today.
Departmental requirements are sufficiently unique that departments increasingly build their own shadow programs for BI and analytics. This is because most analytic applications have a departmental (or business-unit) bias. Sales and marketing staff need to control customer analytics to maximize benefit. Likewise, procurement needs to control supply chain analytics and the financial department should control financial analytics. As analytics applications proliferate, many are funded and sponsored at the department level, not through a central IT or BI/DW program. To keep departmental systems from becoming data silos, data warehouse architectures in such organizations are becoming more federated and logical so that the architectural plan stretches across multiple systems in multiple departments.
Business requirements for real-time data continue to intensify. This slow-moving trend has been with us for almost 20 years now. Its most visible manifestation is operational BI, a very common practice nowadays, where fresh operational data is fetched quickly multiple times during the business day and refreshed into reports and dashboards for managers who lead time-sensitive business processes. The leading edge is now event processing, where data representing the most recent business events is received and processed, with reaction timeframes of hours or minutes (by humans), sometimes seconds or milliseconds (by software). Events include transactions, customer behaviors, operational performance, potential fraud, and so on. Traditional data warehouse architectures were designed for "data at rest," but real-time functionality for "data in motion" can be retrofitted into the architecture. Even so, an increasing number of user organizations choose to acquire additional, standalone systems for the increasingly short time frames within which businesses want to operate.
Adoption of analytic, big-data, and real-time technologies is also an opportunity for businesses to rethink the economics of data warehouse architectures. By now, you've probably noticed that most of the architectural changes mentioned here involve adding more platform types to an extended data warehouse environment. One way to segregate available platforms is by approximate price or total cost, using whatever metric you like. Most users interviewed by TDWI agree that the traditional relational data warehouse is the most expensive, and Hadoop products are the least expensive, with columnar and appliance platforms in the middle. Many organizations are restructuring their data warehouse architectures according to this rough financial plan. Restructuring usually entails moving as much data as makes sense from the core data warehouse to other platforms. In particular, data staging, archives of detailed source data, and unstructured and semi-structured data are trending toward Hadoop or appliances. Financial considerations aside, these data types are better served on newer platforms. Furthermore, this strategy frees up capacity for new solutions that run best on a relational warehouse -- namely, production reporting, dashboards, performance management, and OLAP.
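The rough financial plan described above can be sketched as a simple placement table. The following Python is only an illustration under stated assumptions: the names `Platform`, `COST_RANK`, `PLACEMENT`, and `candidate_platforms` are hypothetical, the cost ranks express relative order only (not prices), and the workload-to-platform mapping simply restates the tendencies mentioned in this article rather than any TDWI-prescribed method.

```python
from __future__ import annotations

from enum import Enum


class Platform(Enum):
    """Platform types named in the article (hypothetical enum for illustration)."""
    RELATIONAL_WAREHOUSE = "relational data warehouse"
    COLUMNAR = "columnar database"
    APPLIANCE = "data appliance"
    HADOOP = "Hadoop"


# Relative cost order only (1 = least expensive), following the rough ranking
# reported by interviewed users: Hadoop lowest, relational warehouse highest,
# columnar and appliance platforms in the middle. These are not real prices.
COST_RANK = {
    Platform.HADOOP: 1,
    Platform.COLUMNAR: 2,
    Platform.APPLIANCE: 2,
    Platform.RELATIONAL_WAREHOUSE: 3,
}

# Assumed placement table: which platforms each kind of data or workload
# tends toward, as described in the article.
PLACEMENT = {
    "production reporting": [Platform.RELATIONAL_WAREHOUSE],
    "dashboards": [Platform.RELATIONAL_WAREHOUSE],
    "performance management": [Platform.RELATIONAL_WAREHOUSE],
    "OLAP": [Platform.RELATIONAL_WAREHOUSE],
    "data staging": [Platform.APPLIANCE, Platform.HADOOP],
    "detailed source archive": [Platform.APPLIANCE, Platform.HADOOP],
    "unstructured data": [Platform.APPLIANCE, Platform.HADOOP],
    "semi-structured data": [Platform.APPLIANCE, Platform.HADOOP],
}


def candidate_platforms(workload: str) -> list[Platform]:
    """Return the candidate platforms for a workload, least expensive first.

    Raises KeyError for a workload that is not in the placement table,
    rather than guessing a default platform.
    """
    return sorted(PLACEMENT[workload], key=lambda p: COST_RANK[p])


if __name__ == "__main__":
    for workload in PLACEMENT:
        names = ", ".join(p.value for p in candidate_platforms(workload))
        print(f"{workload}: {names}")
```

A real placement decision would weigh far more than cost rank (service levels, skills, data volumes, governance), so a table like this is at most a starting point for discussion.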
For more information about trends in data warehouse architectures, attend the TDWI Webinar Evolving Data Warehouse Architectures on April 15, 2014. Register online here. You may also be interested in the upcoming TDWI Best Practices Report: Evolving Data Warehouse Architectures in the Age of Big Data, which will be available at http://tdwi.org/bpreports after April 1, 2014.