Data Issues Take 2 Days On Average To Spot and Fix, Bigeye Survey Says
Bigeye’s State of Data Quality Report finds that more than half of the respondents have experienced five or more data issues over the last three months.
Note: TDWI’s editors carefully choose press releases related to the data and analytics industry. We have edited and/or condensed this release to highlight key information but make no claims as to its accuracy.
Bigeye, a data observability specialist, has released the results of its 2023 State of Data Quality survey. The report sheds light on the most pervasive problems in data quality today.
The report, researched and authored by Bigeye, draws on answers from 100 survey respondents. At least 63 came from midsize-to-large cloud data warehouse customers (with spending of more than $500k per year) who have some form of data monitoring in place, whether third-party or built in-house.
First Line of Defense Against Data Issues
Bigeye’s survey found that data engineers are the first line of defense in managing data issues, followed closely by software engineers. The role of data engineer is now on par with that of software engineer. Like software engineers, data engineers are in charge of a product -- the data product -- that increasingly demands software-like levels of process, maintenance, and code review.
Desire for Automation
Respondents who used third-party data monitoring solutions reported roughly two to three times better ROI than those using in-house solutions. They also noted that, at full utilization, third-party data monitoring addressed two issues: fractured infrastructure and anomalous data. They further reported that third-party data monitoring solutions had better test libraries and a broader perspective on data problems.
Data Incident Frequency
The survey found that companies experience a median of five to ten data incidents over a period of three months. These incidents range from those that reduce engineer productivity to those severe enough to affect the company's bottom line. They take an average of 48 hours to troubleshoot.
Organizations with more than five data incidents a month are essentially lurching from incident to incident with little ability to trust data or invest in larger data infrastructure projects. They are largely performing reactive rather than proactive data quality work.
Other Important Survey Results
The survey revealed other insights, including:
- Respondents reported that it took them 37,500 person-hours to build an in-house data quality monitoring solution; this roughly equates to one year of work for 20 engineers
- Seven in ten respondents reported at least two data incidents that diminished the productivity of their teams
- Data issues most commonly take about 1-2 days to spot and fix, but with a long tail lasting as long as weeks or months
- Respondents reported at least two “severe” data incidents in the last six months that damaged the business or its bottom line and were visible at the C-level
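The person-hours figure in the first item above can be checked with simple arithmetic. The sketch below is illustrative only: the 37,500 hours and 20 engineers come from the survey summary, while the 40-hour work week and the function names are assumptions introduced here, not part of Bigeye's report.

```python
# Figures quoted in the survey summary.
PERSON_HOURS = 37_500
ENGINEERS = 20

# Assumption (not from the report): a 40-hour full-time work week.
HOURS_PER_WEEK = 40


def hours_per_engineer(total_hours: float, engineers: int) -> float:
    """Split the total effort evenly across the engineers."""
    return total_hours / engineers


def weeks_of_work(hours: float, hours_per_week: float = HOURS_PER_WEEK) -> float:
    """Convert one engineer's hours into working weeks."""
    return hours / hours_per_week


per_engineer = hours_per_engineer(PERSON_HOURS, ENGINEERS)
weeks = weeks_of_work(per_engineer)

print(f"Hours per engineer: {per_engineer:,.0f}")
print(f"Working weeks per engineer: {weeks:.1f}")
```

Under that assumed 40-hour week, each engineer's share is 1,875 hours, or about 47 working weeks, which is consistent with the "roughly one year" description.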
“Coming from a data team before starting Bigeye, I knew anecdotally how much of a burden data quality and pipeline reliability issues were. These survey results confirmed my experience: data quality issues are the biggest blockers preventing data teams from being successful,” said Kyle Kirwan, Bigeye’s CEO and co-founder. “We’ve heard that around 250-500 hours are lost every quarter just dealing with data pipeline issues.”
The full report is available from Bigeye. No registration is required.