Monte Carlo’s Circuit Breakers Helps Data Teams Automatically Stop Broken Data Pipelines
The data observability platform’s new capabilities stop broken data pipelines before bad data impacts the business.
Note: TDWI’s editors carefully choose vendor-issued press releases about new or upgraded products and services. We have edited and/or condensed this release to highlight key features but make no claims as to the accuracy of the vendor's statements.
Monte Carlo, provider of data reliability tools and platforms, has released a new suite of data observability capabilities to help data teams automatically stop broken data pipelines before they impact the business.
Data engineers spend upwards of 30 percent of their time tackling data downtime, meaning periods of time when data is missing, erroneous, or otherwise inaccurate. These issues cost companies millions of dollars per year, eroding trust in the data that informs decision-making and powers digital services.
Monte Carlo’s Circuit Breakers automates testing and manages all quality checks across the entire data pipeline within a single interface. Circuit Breakers limits the downstream impact of bad data by stopping data quality issues closer to the source. With this release, Monte Carlo delivers both automatic and custom orchestration-based tests to monitor for data quality issues.
A Better Way Forward
Traditionally, data teams have relied on manual tests to keep poor-quality data from reaching production systems and downstream stakeholders. To troubleshoot, teams must toggle between dozens of tools and platforms, which slows root cause analysis (RCA) and makes it difficult to develop a central source of truth about their data.
By stopping data processing jobs when data quality rules fail, Monte Carlo’s Circuit Breakers reduces time to detection and resolution of data issues, ensuring that teams avoid backfilling costs associated with cascading data failures. Simultaneously, Apache Airflow logs, data build tool (dbt) models, and other metadata related to an incident are made available in Monte Carlo, alongside additional root cause analysis capabilities.
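The circuit-breaker idea described above, halting a job when a data quality rule fails so that bad data never reaches the next stage, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Monte Carlo's implementation or API: the names `Rule`, `CircuitBreakerOpen`, and `run_with_breaker`, and the two sample rules, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


class CircuitBreakerOpen(Exception):
    """Raised to halt the pipeline when at least one rule fails."""


@dataclass
class Rule:
    # Hypothetical rule type: a name plus a check that returns True on pass.
    name: str
    check: Callable[[List[dict]], bool]


def run_with_breaker(rows, rules, load):
    """Evaluate every rule on the batch; load it only if all rules pass."""
    failed = [rule.name for rule in rules if not rule.check(rows)]
    if failed:
        # Stop before the load step so nothing downstream sees this batch.
        raise CircuitBreakerOpen(f"halted before load; failed rules: {failed}")
    load(rows)
    return len(rows)


# Illustrative rules (assumptions, not vendor defaults).
rules = [
    Rule("non_empty", lambda rows: len(rows) > 0),
    Rule("no_null_ids", lambda rows: all(r.get("id") is not None for r in rows)),
]
```

In an orchestrator, the raised exception would fail the task that wraps the load step, so dependent tasks do not run and no backfill is needed for data that was never written.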
In addition to helping teams avoid backfilling costs and improve engineering efficiency, Circuit Breakers allows data teams to:
- Deploy data reliability checks directly into data pipelines. For the first time, data teams can integrate Monte Carlo’s data quality tests directly into the orchestration layer, helping identify and stop bad data at its source.
- Manage rules and tests within a single platform. Data engineering teams can create and manage unscheduled data quality checks with Monte Carlo. Updates to the rules are automatically reflected in Airflow DAGs and other data processing jobs.
- More easily triage data quality issues. Once a job is stopped, Monte Carlo alerts data engineering teams via Slack, Microsoft Teams, email, webhooks, and other communication channels so they can prioritize associated issues and reduce time to resolution.
- Prevent broken dashboards and reports downstream. When data fails a rule, the circuit breaker automatically stops the data processing job, preventing tables, downstream dashboards, and reports from being populated with bad data. As a result, the business avoids the negative impact of decisions made with inaccurate, stale, or duplicate data.
- Set both automatic and custom rules for circuit breakers. Monte Carlo automatically generates circuit breakers based on common data quality checks, including null values, freshness, and distribution, with the option to set custom rules and alerts based on the needs of your business.
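The three common check types named in the list above (null values, freshness, and distribution) could look roughly like the following. This is a hedged sketch: the function names, the thresholds, and the z-score approach to distribution drift are illustrative assumptions, not the rules Monte Carlo generates.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, pstdev


def null_rate_ok(values, max_null_rate=0.05):
    """Pass when the share of None values is at or below the threshold."""
    if not values:
        return False  # an empty column is treated as a failure here
    nulls = sum(1 for v in values if v is None)
    return nulls / len(values) <= max_null_rate


def freshness_ok(last_updated, now, max_age=timedelta(hours=24)):
    """Pass when the table was updated within the allowed age."""
    return now - last_updated <= max_age


def distribution_ok(history, current, max_z=3.0):
    """Pass when the current metric is within max_z standard deviations
    of its historical mean (a simple z-score test)."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return current == mu
    return abs(current - mu) / sigma <= max_z


# Example inputs (made up for illustration).
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)
daily_row_counts = [100, 102, 98, 101, 99]
```

Each function returns a boolean, so any of them can serve as the pass/fail condition that decides whether a pipeline step is allowed to proceed.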
With Circuit Breakers, data teams can more easily achieve data quality KPIs and SLAs, reducing time to detection (TTD) and time to resolution (TTR) for data issues from hours or days to minutes. By automating circuit breakers, data engineering teams can better scale data reliability across their entire data stack without needing to update code each time a change is required.
More information is available at https://www.montecarlodata.com/.