TDWI Checklist Report | Designing and Operating a Scalable Enterprise Data Lake

March 21, 2019

Data lakes were originally created with the idea that by using modern distributed data computing and storage architectures based on Hadoop, organizations could take advantage of the greater flexibility, more cost-effective computing power, and economical storage to handle big data analytics use cases that couldn’t be managed by traditional enterprise data warehouses (EDWs).

However, early data lakes allowed self-service users to add new data sources with no governance mechanism, so the lakes quickly became little more than glorified dumping grounds.

In recent years, though, innovations have allowed the data lake to evolve into a coordinated and governed environment for accumulating shared data resources that can be used for competitive advantage. Yet many challenges remain in designing an enterprise data lake that is scalable, sustainable, and governable while still maintaining flexibility and agility. As a combination of static and real-time data sources is fed into the data lake, its management and operation become even more complex.

This checklist examines these issues and provides guidance about how to overcome the challenges. It also provides a number of recommendations that will support the design and development of an enterprise data lake that is not just sustainable but will also scale as data volumes continue to explode.

