Building a Successful Data Lake in the Cloud
TDWI Speaker: Philip Russom, Senior Research Director for Data Management
Oracle Speaker: Erik Bergenholtz, Vice President Software Development Big Data Cloud
Data lakes on Hadoop have come on strong in recent years because they help many types of user organizations – from Internet firms to mainstream industries – capture big data at scale and analyze or otherwise process it for business value.
On the one hand, Hadoop-based data lakes have proved themselves valuable in mission-critical use cases, such as data warehousing, advanced analytics, multichannel marketing, complete customer views, digital supply chains, and the modernization of data management in general. On the other hand, early adopters of data lakes have hit a ceiling, held back by Hadoop’s numerous omissions and weaknesses in key areas such as developer tools, data protection, flexible storage, metadata management, logical containers, and resource management. At the same time, many data lake users are migrating data, applications, and users to the cloud; the challenge there is to take full advantage of the cloud’s elasticity and new best practices instead of settling for a mere “lift and shift” migration.
In this TDWI webinar, we’ll consider platform and tool options that can both fill the holes in Hadoop and extend cloud-based data storage with functionality relevant to data lakes. You will learn about:
- Data lakes: what they are and what they can do for your organization
- Real-world use cases for data lakes in operations, analytics, compliance, etc.
- Hadoop’s strengths and weaknesses: how these affect data lake success
- Options: how diverse platforms, tools, and clouds may be combined in an integrated architecture for Hadoop-based data lakes and other hybrid practices
- How cloud object storage can be a useful replacement for, or complement to, the Hadoop Distributed File System (HDFS)
Philip Russom, Ph.D.