Evolution of Data Governance with Eric Falthzik
Eric Falthzik, strategy principal director with Accenture, discusses the evolution of data governance -- including the role of data stewards, how to govern new implementations, and the role of generative AI.
- By Upside Staff
- January 26, 2024
In this recent “Speaking of Data” podcast, Eric Falthzik discusses the evolution of data governance from centralized gatekeeping to federated enablement. Falthzik is strategy principal director with Accenture and a speaker at the upcoming TDWI Modern Data Leaders Summit. [Editor’s note: Speaker quotations have been edited for length and clarity.]
“Data governance has changed dramatically over the years,” Falthzik began. “Until recently, governance was mostly focused on the ‘people’ part of the ‘people, process, and technology’ triad. There were governance councils and data stewards who were spending all their time in meetings trying to create policies that they would then enforce on employees trying to use data for their work.”
Falthzik explained that although those policies and guardrails are still important, business now moves too quickly to allow for such a slow-moving process. Workers need self-service access to data and analytics to remain competitive in the future.
He added, “Enabling self-service involves some new areas of governance -- for example, pursuing active metadata management and being more diligent about data quality. We also need to discuss how we’re going to go forward in a world of data products and AI.”
Another key component of modern data governance Falthzik recommends is implementing a federated architecture. “A centralized environment is part of the old-school process of a small group maintaining tight control over data; it won’t work in a self-service environment,” he said. “Business workers want to feel some sense of involvement in the process of governing the data they use daily. Additionally, some new concepts such as the data mesh recommend that the data domains be given far more autonomy, which can’t be done in a centralized environment.”
He also noted that an added benefit of assigning more data operations to the business is that it will help identify those who would make the best data stewards.
“The best data stewards are the ones feeling the most pain,” he explained. “Whether they’re struggling with data quality or with trying to enable their data analysis, they’re the ones with the most immediate struggles, and therefore the most motivated to overcome them.”
The introduction of productized data has increased the importance of having data governance embedded within the business, Falthzik continued. “It’s critical that the data steward be able to work closely with the data product manager to ensure the data going into the product is high quality and fit for purpose.”
One question arose during the conversation, though -- if so many organizations rank data governance as a top priority (which TDWI research says they do), why do so many of them wait until after implementing a new platform to worry about it?
Falthzik likened it to being in the batter’s box at a baseball game.
“Why would I want to swing at a pitch after it’s already in the catcher’s mitt?” he said. “If we’re implementing a large project, why do we want to put in all the effort of governing our data after the solution has been built? Data governance needs to be part of the process. It’s much better to populate your data lake with high-quality data, for example, than to try to clean it up once it’s in there.”
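To make the data lake point concrete, here is a minimal Python sketch of a quality gate that checks records before they are loaded rather than after. It is an illustration only, not something Falthzik described: the rule names, the `customer_id` and `amount` fields, and the `gate` function are all hypothetical, and a real implementation would take its rules from the organization's own governance policies.

```python
from dataclasses import dataclass
from typing import Any, Callable

Record = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A named data quality check (hypothetical structure for illustration)."""
    name: str
    check: Callable[[Record], bool]


def _has_customer_id(record: Record) -> bool:
    # Assumed rule: every record needs a non-empty customer_id.
    return bool(record.get("customer_id"))


def _amount_is_valid(record: Record) -> bool:
    # Assumed rule: amount must be a non-negative number (booleans excluded).
    amount = record.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return False
    return amount >= 0


# Hypothetical rule set; real rules would come from data stewards.
RULES = [
    Rule("customer_id present", _has_customer_id),
    Rule("amount is a non-negative number", _amount_is_valid),
]


def gate(records: list[Record], rules: list[Rule] = RULES):
    """Split records into those safe to load and those to quarantine.

    Returns (accepted, quarantined), where each quarantined entry pairs
    the record with the names of the rules it failed.
    """
    accepted: list[Record] = []
    quarantined: list[tuple[Record, list[str]]] = []
    for record in records:
        failures = [rule.name for rule in rules if not rule.check(record)]
        if failures:
            quarantined.append((record, failures))
        else:
            accepted.append(record)
    return accepted, quarantined


if __name__ == "__main__":
    batch = [
        {"customer_id": "C1", "amount": 10.0},
        {"customer_id": "", "amount": 5},
        {"customer_id": "C3", "amount": -1},
    ]
    ok, held = gate(batch)
    print(f"{len(ok)} accepted, {len(held)} quarantined")
    for record, failures in held:
        print(record, "failed:", ", ".join(failures))
```

Only the accepted records would go on to the lake. The quarantined ones, each tagged with the rules it failed, could be routed back to the data steward for that domain, which keeps governance in the loading process instead of leaving it as a cleanup project.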
Falthzik also strongly emphasized the importance of using only similarly high-quality data in any generative AI initiative your company may undertake. “If poor-quality data gets into your ecosystem, it will burrow in deeper and deeper, leading to some potentially very bad things,” he said.
“Another area that’s vitally important is query curation,” he continued. “In the old days, when I was writing query code, the concern was about breaking the database. Now, the problem isn’t that the database will break but that it will give you a bad answer -- one that’s wrong or made up -- which is a bit of a more challenging issue.” Teaching users to write queries that not only avoid bad answers but also generate the best ones will be an ongoing task, he said.
[Editor’s note: Falthzik’s presentation, “Expert Best Practices: The Evolution of the Data Governance Operating Model in the Era of Data Products and Generative AI” is part of the Modern Data Leaders Summit in Las Vegas (February 19-21, 2024). See the full agenda.]