Building a More Reliable Data Lakehouse in the Cloud
Webinar Speaker: Fern Halper, TDWI VP Research, Senior Research Director for Advanced Analytics
Date: Tuesday, April 25, 2023
Time: 9:00 a.m. PT, 12:00 p.m. ET
As organizations utilize higher volumes of diverse data, cloud data lakehouses are becoming more popular. However, at TDWI, we see that data quality remains a top challenge for organizations moving to the cloud. From null values and duplicate rows to modeling errors and schema changes, data can break for millions of reasons. To combat this for their lakehouse environments, modern data teams are increasingly adopting best practices from DevOps and software engineering to identify and resolve issues—and even prevent this "data downtime" from happening in the first place.
Join this panel webinar with representatives from Databricks and Monte Carlo to learn more about how data engineering teams adopting data lakehouse architectures can reduce the time it takes to detect and resolve data quality incidents and, in the process, build more trustworthy and reliable lakehouse environments.
Topics include:
- The biggest data quality challenges facing companies today
- How to optimize for higher data quality across your lakehouse’s metadata, storage, and query engine layers
- How teams can start tracking and measuring data reliability with service level agreements (SLAs), service level objectives (SLOs), and service level indicators (SLIs)
- How to roll out a winning data observability strategy for your lakehouse
- And much more
Guest Speakers
Roberto Salcido
Senior Solutions Architect
Databricks
Roberto is currently a senior solutions architect at Databricks. In a previous life, he worked at Mode, where he greatly enjoyed partnering with business analysts to surface critical insights to their stakeholders at scale. In his free time, he enjoys running, nature, and watching sports.
Shane Murray
Field Chief Technology Officer
Monte Carlo
Shane Murray is the field CTO at Monte Carlo Data, a data observability platform that connects to your existing data stack, continuously monitors for freshness, distribution, volume, and schema changes, and immediately notifies stakeholders once incidents are detected. Before Monte Carlo, Shane was SVP of data and insights at the New York Times, leading 150-plus employees across data science, analytics, governance, and data platforms.
At the Times, Shane expanded the team into areas like applied machine learning, experimentation, and data privacy, delivering research and insights that improved the Times’ ability to draw and retain a large audience and scale the digital subscription business, which grew tenfold. Before joining the Times, Shane led data teams at the start-up Memetric and at Accenture, helping companies build and scale experimentation programs.
Fern Halper, Ph.D.