

The October TDWI Virtual Summit has concluded, but on-demand access is available for previously registered attendees through April 13, 2022.



Data Quality and the Data Science Pipeline (With Audience Q&A)

October 14, 2021

Prerequisite: None

Tamraparni Dasu, Ph.D.

Lead Inventive Scientist, Data Science and AI Research

AT&T Chief Data Office, USA

This session will include a moderated Q&A featuring questions from the live audience.

As the world moves toward automation by running statistical, machine learning, and AI algorithms on exponentially increasing amounts of data, the ability to extract actionable insights depends on the quality of the data. Data science pipelines that support ML/AI activities focus primarily on the scale and speed of data delivery, often at the cost of quality. Data quality issues frequently manifest as anomalies; if not detected and addressed promptly, they contaminate data in warehouses and data lakes, leading to erroneous conclusions and to expensive, time-consuming data reconciliation and repairs. In this talk, we discuss the process of ensuring data quality by managing and addressing anomalies throughout the entire data science pipeline, with examples drawn from real life.
