Modern Data Architecture with Apache Hadoop: The Hybrid Data Warehouse
March 1, 2017
Apache Hadoop didn’t disrupt the data center; the data did.
Shortly after corporate IT functions within enterprises adopted large-scale systems to manage data, the enterprise data warehouse (EDW) emerged as the logical home of all enterprise data. Today, every enterprise has a data warehouse that serves to model and capture the essence of the business from its enterprise systems.
The explosion of new types of data in recent years—from inputs such as the Web and connected devices, or just sheer volumes of records—has put tremendous pressure on the EDW.
In response to this disruption, a growing number of organizations have turned to Apache Hadoop to manage these enormous data volumes while maintaining the coherence of the data warehouse. Many pair it with data virtualization, which provides a single logical data-access abstraction layer across multiple data sources, enabling rapid delivery of complete information to business users.
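To make the idea of a single logical access layer concrete, here is a minimal sketch using Python's standard-library `sqlite3` module. Two separate in-memory databases stand in for an EDW and a Hadoop-based data lake; attaching them to one connection lets a single query span both without copying data between systems. The table names, schemas, and sample rows are purely illustrative, and a real deployment would use a dedicated virtualization layer rather than SQLite.

```python
import sqlite3

# "EDW" store: customer master data (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex')")

# "Data lake" store: raw clickstream events, attached as a second
# database so it appears under the same logical connection.
conn.execute("ATTACH DATABASE ':memory:' AS lake")
conn.execute("CREATE TABLE lake.clicks (customer_id INTEGER, url TEXT)")
conn.execute(
    "INSERT INTO lake.clicks VALUES (1, '/pricing'), (1, '/docs'), (2, '/home')"
)

# One federated query joins both "systems" through the shared
# abstraction layer; no rows are replicated between the stores.
rows = conn.execute("""
    SELECT c.name, COUNT(*) AS clicks
    FROM customers AS c
    JOIN lake.clicks AS k ON k.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(rows)  # [('Acme', 2), ('Globex', 1)]
```

The point of the sketch is the query shape: business users address one logical schema, while the underlying data stays in place in each source system.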
This paper discusses Apache Hadoop, its capabilities as a data platform, and how, in combination with data virtualization, it supports hybrid data warehouses that deliver a single, unified, and coherent logical view of all enterprise data assets inside and outside the data lake, while minimizing unneeded data replication.