Media companies increasingly rely on data-driven insights to make critical business decisions. The complexity of the data types, the rate at which data is captured, and the growing number of data producers and consumers make it very difficult to deliver these insights quickly. At the same time, a company must govern its data so that analysts and data scientists can select trustworthy data sources, and so that it can keep tabs on who is using which type of data to drive business decisions. Applications that leverage machine learning and artificial intelligence are becoming the norm, and the data used to train those models will face added scrutiny.
This talk focuses on the modular database framework created by the Data Platforms organization at Disney to support self-service data management, trusted data use, built-in governance oversight of the types of data each team consumes, and overall observability of the ecosystem.
The framework provides a database, source code control with CI/CD, and orchestration for scheduling jobs. It also clearly identifies who is using the data and how much, and applies default retention policies to keep cloud storage costs under control.