Executive Summary | Unifying Data Management and Analytics Pipelines
Executive summary for the TDWI Best Practices Report: Unifying Data Management and Analytics Pipelines
- By James G. Kobielus
- March 21, 2022
Enterprises require a wide range of platforms, tools, skills, and techniques for operationalizing data and analytics models successfully into production applications.
Robust back-end development, testing, and operationalization ensure that analytics are always accurate, relevant, and fit for purpose. DataOps pipeline processes continuously integrate, transform, and prepare data for deployment into analytics applications. MLOps pipelines handle the continuous building, training, serving, and optimization of machine learning, deep learning, natural language processing, and other statistical models.
There is considerable potential for growth in the adoption of unified DataOps and MLOps pipelines in today’s enterprise environments. Both leverage and extend enterprise investments in unified analytics, data engineering, data governance and curation, data science and statistical modeling, and metadata management. Both require a high-performance pipeline in which data and models are prepared, packaged, tested, and deployed into useful applications. Both are essential elements of enterprise convergence of data warehouse and data lake platforms into lakehouses. Both derive value from integration with the continuous integration and continuous deployment pipeline tooling often referred to as DevOps.
For years, TDWI research has tracked the platforms, tools, skills, techniques, and processes needed to operationalize data and analytics models in enterprise applications. This TDWI Best Practices Report discusses the current state of DataOps and MLOps practices, platforms, and pipelines in modern organizations. It describes key business drivers, implementation challenges, and use cases for DataOps and MLOps pipelines in their respective domains, and for the unification of these pipelines within enterprise IT infrastructures. It also highlights the trend in which enterprises are unifying their DataOps and MLOps pipelines to improve how advanced analytics, artificial intelligence, and other intelligent applications are developed, tested, deployed, and optimized.
This study leverages findings from a survey of 204 data management and analytics professionals worldwide. It illustrates how the unification of DataOps and MLOps platforms, workflows, and methodologies is dovetailing with enterprise data modernization initiatives. Chief among these initiatives are the widespread migration of enterprise data analytics platforms to the cloud and consolidation of data warehousing and data lakes into next-generation architectures such as data lakehouses, data meshes, and virtualized data fabrics.
One notable trend is how heavily enterprises rely on DevOps continuous integration/continuous deployment (CI/CD) tools as a standardizing framework for harmonizing their disparate investments in DataOps and MLOps platforms and tools. Almost one-quarter of respondents (24%) stated that they already use CI/CD tools to orchestrate their enterprise data analytics production pipelines.
Incorta, Matillion, SAP, and Snowflake sponsored the research and writing of this report.
James Kobielus is senior director of research for data management at TDWI. He is a veteran industry analyst, consultant, author, speaker, and blogger in analytics and data management. At TDWI he focuses on data management, artificial intelligence, and cloud computing. Previously, Kobielus held positions at Futurum Research, SiliconANGLE Wikibon, Forrester Research, Current Analysis, and the Burton Group. He has also served as senior program director, product marketing for big data analytics for IBM, where he was both a subject matter expert and a strategist on thought leadership and content marketing programs targeted at the data science community.