Modernizing the Logical Data Warehouse
Why and how you should upgrade your data warehouse architecture's logical, virtual, semantics, and services layers.
- By Philip Russom
- October 14, 2019
Many approaches to the logical data warehouse (LDW) have been with us for decades now. As with all other components of data warehouse architectures, the logical data warehouse is evolving at an accelerating pace to support new data sources, structures, ingestion methods, and near-time latencies, as well as the broadening range of analytics and other use cases that enterprises increasingly demand. Similarly, the logical data warehouse is incorporating recent advances in data semantics, service architectures, integration methods, and data platforms.
Putting all that together reveals that the modernization of the logical data warehouse is well underway, driven by the need to better serve new data types, new data platforms, and new business use cases. Organizations need to plan their modernization of the logical data warehouse to achieve high-value goals such as the digital enterprise, business transformation, real-time analytics, and other modern data-driven business practices, both analytic and operational.
How does the logical data warehouse compare to a traditional, physical warehouse?
TDWI has always defined the data warehouse as a data architecture of multiple layers. Some layers are inherently physical, as in the systems architecture where hardware servers and software servers combine to form data platforms where data is stored. Other layers are intrinsically virtual, as in the logical data warehouse, where data types, structures, names, and relationships are documented in data semantics ranging from modern catalogs to tried-and-true metadata (whether technical, business, or operational).
Compared to the physical layers of the data warehouse architecture, the logical data warehouse depends more on techniques that define data without persisting it, as in data virtualization, data modeling, business metadata, and various approaches to data views. The logical data warehouse gets most of its enabling technologies from the virtual functions of data integration and application integration platforms.
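As a minimal sketch of "defining data without persisting it," the example below uses a SQL view in Python's built-in sqlite3 module. The table name, view name, and sample rows are hypothetical and only illustrate the idea; a real LDW would typically rely on a data virtualization or integration platform rather than a single embedded database.

```python
import sqlite3

# Hypothetical physical layer: one persisted table of orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# Logical layer: the view stores only a definition, not a copy of the rows.
conn.execute(
    "CREATE VIEW revenue_by_region AS "
    "SELECT region, SUM(amount) AS revenue FROM orders GROUP BY region"
)


def revenue_by_region(connection):
    """Query the view; results are computed from the table at read time."""
    rows = connection.execute(
        "SELECT region, revenue FROM revenue_by_region ORDER BY region"
    ).fetchall()
    return dict(rows)


before = revenue_by_region(conn)
# A new physical row is visible through the view without any reload step.
conn.execute("INSERT INTO orders VALUES ('west', 25.0)")
after = revenue_by_region(conn)
print(before, after)
```

Because the view holds no data of its own, the second query reflects the newly inserted row immediately, which is the property the logical layer depends on.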
For many warehouses, the logical data warehouse is also where most users and applications access data, which again requires considerable integration infrastructure. Although the logical data warehouse is increasingly the point of entry, the data being accessed resides in the physical layers of the warehouse or in other systems.
What kinds of business use cases does the logical data warehouse support?
Most LDW use cases involve data that must be very fresh to be of value to a business process. For example, the management dashboard is a killer app for the LDW, especially when managers need to refresh their dashboard data every hour or so. The LDW can draw data from a warehouse, operational applications, and many other sources to recalculate performance metrics in real time (or close to it). That's why real-time integration and delivery are typically required of the LDW, as is the ability to extract and merge data from multiple, diverse sources (also known as data federation). Similarly, the LDW enables real-time use cases in analytics, business monitoring, and self-service data prep or visualization.
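Data federation, as described above, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the `sales_history` table stands in for the warehouse, the `open_orders` list stands in for an operational application, and all function names are hypothetical.

```python
import sqlite3
from typing import Dict, List


def warehouse_sales(conn: sqlite3.Connection) -> Dict[str, float]:
    """Historical totals per region from the (hypothetical) warehouse table."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales_history GROUP BY region"
    ).fetchall()
    return {region: total for region, total in rows}


def operational_sales(open_orders: List[dict]) -> Dict[str, float]:
    """Fresh totals per region from a (hypothetical) operational application."""
    totals: Dict[str, float] = {}
    for order in open_orders:
        region = order["region"]
        totals[region] = totals.get(region, 0.0) + order["amount"]
    return totals


def federated_sales(conn: sqlite3.Connection, open_orders: List[dict]) -> Dict[str, float]:
    """Merge both sources at query time; nothing is copied into the warehouse."""
    merged = dict(warehouse_sales(conn))
    for region, amount in operational_sales(open_orders).items():
        merged[region] = merged.get(region, 0.0) + amount
    return merged


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_history (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_history VALUES (?, ?)",
    [("east", 1000.0), ("west", 2000.0)],
)
open_orders = [
    {"region": "east", "amount": 120.0},
    {"region": "north", "amount": 80.0},
]

totals = federated_sales(conn, open_orders)
print(totals)
```

Each dashboard refresh would call `federated_sales` again, so the metric reflects whatever the operational source holds at that moment rather than the last batch load.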
Why is it important to modernize a logical data warehouse?
The warehouse's systems layer has seen significant change (and hype) in recent years because it now includes new types of data platforms (Hadoop, NoSQL, cloud-based databases) and computing platforms (clouds and clusters). The virtual layer is evolving, too, due to advancements in data virtualization, in-memory processing, data modeling, and modern data semantics. Put the two complementary layers together and you have a complete data warehouse architecture, where each layer has its own unique and pressing needs for modernization.
For many user organizations, recent efforts in data warehouse modernization have largely focused on the systems architecture (because of its dramatic changes), often resulting in replatforming (which introduces new data platforms to the increasingly multiplatform data warehouse environment). For other organizations, modernization focuses on enriching the LDW with modern data semantics.
For example, warehouse programs tend to depend on technical metadata but are now under pressure to develop support for business metadata, which is required for modern practices such as self-service data discovery and data prep. For the real-time and federated LDW use cases mentioned above, many organizations are relying more on data virtualization platforms, which are usually semantics driven.
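To make the distinction between technical and business metadata concrete, here is a small Python sketch. The catalog entries, column names, and the `find_columns` helper are all hypothetical; the point is only that business metadata lets a self-service user search in business terms and still land on the right physical columns.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ColumnMetadata:
    # Technical metadata: how the data is physically defined.
    table: str
    column: str
    data_type: str
    # Business metadata: how business users name and understand it.
    business_name: str
    description: str


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    ColumnMetadata("dw.cust_dim", "cust_nm", "VARCHAR(80)",
                   "Customer Name", "Legal name of the customer account"),
    ColumnMetadata("dw.ord_fact", "ord_amt", "DECIMAL(12,2)",
                   "Order Amount", "Total order value before tax"),
    ColumnMetadata("dw.ord_fact", "cust_id", "INTEGER",
                   "Customer ID", "Key linking an order to its customer"),
]


def find_columns(catalog: List[ColumnMetadata], term: str) -> List[str]:
    """Return physical column paths whose business metadata mentions the term."""
    needle = term.lower()
    return [
        f"{entry.table}.{entry.column}"
        for entry in catalog
        if needle in entry.business_name.lower()
        or needle in entry.description.lower()
    ]


customer_columns = find_columns(CATALOG, "customer")
print(customer_columns)
```

A search on cryptic technical names such as `cust_nm` would require the user to know the naming convention already; searching the business names and descriptions does not.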
How do you modernize a logical data warehouse?
From these LDW examples, you can see that LDW modernization usually entails beefing up the technologies that improve LDW performance (e.g., in-memory processing, data pipelining, hardware upgrades), integrating with a broader range of apps and databases (via integration functions), and getting more creative with logical modeling (as enabled by data virtualization and modern data semantics). LDW modernization may also add new functionality for embedded analytics or data cataloging. Because logical modeling makes or breaks the design of an LDW, most organizations also modernize logical designs so that they represent more business entities and data sources and are better optimized for queries and integration operations.
For more information, view the 2019 TDWI Webinar Modernizing the Logical Data Warehouse. Portions of this article are based on that presentation.
For background, see the Upside article Defining the Logical Data Warehouse.
Philip Russom is director of TDWI Research for data management and oversees many of TDWI’s research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 600 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at firstname.lastname@example.org, @prussom on Twitter, and on LinkedIn at linkedin.com/in/philiprussom.