Using DataOps for Data Pipeline Engineering Quality
TDWI Speaker: David Loshin, President of Knowledge Integrity
Date: Thursday, June 20, 2019
Time: 9:00 a.m. PT, 12:00 p.m. ET
Data pipelines facilitate information flows and data exchange for a growing number of operational scenarios, including data extraction, transformation, and loading (ETL) into data warehouses and data marts, data migrations, production of BI reports, and application interoperability. When data engineers develop data pipelines, they may devise a collection of tests to guide the development process, but ongoing tests are rarely put in place once those pipelines go into production.
At development time, you need to certify that the ETL processes you have developed do what they were intended to do. In production, you need continuous monitoring to identify process or data flaws that affect operational quality. Yet as the number of applications grows, it becomes increasingly complex to develop and test data pipelines in a way that ensures the quality of the data flowing through them. In this webinar, we explore the concept of DataOps and how DevOps concepts can be applied to automate key aspects of data pipeline quality and ongoing monitoring across the entire data lifecycle.
Attendees will learn about:
- The need for test automation
- Instituting production data monitoring
- Leveraging DataOps for data pipeline quality
- Managing regression packs for quality assurance
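
To make the topics above concrete, here is a minimal sketch of what a reusable "regression pack" might look like: one set of data checks that can be run against a pipeline step during development and again on each production batch. It is not taken from the webinar; the `transform` step, the `Check` type, the `REGRESSION_PACK` list, and the sample rows are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Check:
    """A named data-quality rule applied to a whole batch of rows."""
    name: str
    predicate: Callable[[List[Row]], bool]


def transform(rows: List[Row]) -> List[Row]:
    """Hypothetical ETL step: drop rows without an id, normalize country codes."""
    return [
        {**r, "country": str(r["country"]).strip().upper()}
        for r in rows
        if r.get("id") is not None
    ]


# Hypothetical regression pack: the same checks serve as development-time
# tests and as production monitoring rules.
REGRESSION_PACK = [
    Check("non_empty", lambda rows: len(rows) > 0),
    Check("ids_unique", lambda rows: len({r["id"] for r in rows}) == len(rows)),
    Check("country_is_two_letters", lambda rows: all(len(r["country"]) == 2 for r in rows)),
]


def run_pack(rows: List[Row], pack: List[Check] = REGRESSION_PACK) -> List[str]:
    """Return the names of the checks that failed; an empty list means the batch passed."""
    return [c.name for c in pack if not c.predicate(rows)]


if __name__ == "__main__":
    sample = [
        {"id": 1, "country": " us "},
        {"id": 2, "country": "de"},
        {"id": None, "country": "fr"},
    ]
    failures = run_pack(transform(sample))
    print("failed checks:", failures)
```

In a DataOps setup, a function like `run_pack` could be called from a test suite at build time and from a scheduler after each production load, so that a failing check blocks a release or raises an alert. How failures are routed is left out here and would depend on the tooling in use.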