Prerequisite: None
The data lake has emerged as an effective design for organizing big data and for extending larger architectures such as the data warehouse.
According to TDWI’s 2020 Data and Analytics Survey, users’ data management priorities for 2020 include supporting advanced analytics, correcting existing data architectures, providing flexible data access via self-service, and handling increasing data volumes. A data warehouse tightly integrated with a data lake supports all of these priorities, and others as well.
In fact, over half of organizations responding to recent surveys from TDWI now have data lakes in production. Survey responses also reveal that data lakes are most often deployed for four kinds of use cases: advanced analytics and big data capture, data warehouse extension and modernization, self-service data practices, and operational data ecosystems (for digital marketing, online supply chain, etc.).
In response to these user priorities and design directions, TDWI sees rapid adoption of the integrated data warehouse and data lake. Among other things, the integrated warehouse/lake is a challenging exercise in data design patterns and data architecture. On the local level, both warehouse and lake need thoughtfully constructed internal micro designs for data models (plus schema on read), special structures (dimensions, time series), sandboxes, and in-storage data processing. On the global level, integrating the data warehouse and data lake relies on a macro data architecture that is often multiplatform and hybrid, stitched together with hefty data integration solutions.
This talk draws on TDWI Research reports to survey trends, drivers, and best practices concerning the integrated data warehouse and data lake. It focuses on cloud and hybrid data architectures, plus use cases in advanced analytics, data warehousing, data integration, and self-service data practices. The goal is to help business and technology people design and integrate their warehouses and lakes effectively, for both technical efficiency and business advantage.