Data Governance: Benefits and Best Practices
What can data governance do for your enterprise, and how can you improve your data governance program? Semarchy's Michael Hiskey offers some perspective.
By James E. Powell
October 8, 2018
In an age when regulations such as the GDPR impose ever-greater penalties, data governance's importance has never been higher. What benefits can you expect, and how do you get started?
For answers, we spoke to Michael Hiskey, head of strategy at Semarchy, which bills itself as "The intelligent data hub company." Hiskey thinks of his role as "chief product evangelist" and "storyteller of client success." Among the topics we discussed: best practices for data governance.
Upside: From your perspective, what are the three key benefits of a data governance program?
Michael Hiskey: I'd put accessibility at the top of my list. To govern data, you need to bring it into one place (either persisted there or virtually connected). Bringing data from silos into a governed environment, which is part of master data management (MDM), makes it easy for staff to access the data and derive value from it.
Second, I would say security. Once you have data all in one place, it becomes a pressing concern to control who has access to it. Governing data makes it more secure and adds auditability.
Uniformity is the last key benefit. Data quality and enrichment should go hand-in-hand with governance, as should the normalization of reference data. Having uniform options in drop-downs, for example, makes analyzing, reporting, and decision making easier.
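To make the point about reference data concrete, here is a minimal Python sketch of normalizing a free-text country field to one canonical code. The variant table and the function names (`normalize_country`, `normalize_records`) are hypothetical and chosen for illustration only. They are not drawn from Semarchy's products or from any particular tool.

```python
# Hypothetical reference table: canonical code -> spellings seen in source systems.
COUNTRY_VARIANTS = {
    "US": {"us", "usa", "u.s.", "u.s.a.", "united states", "united states of america"},
    "GB": {"gb", "uk", "u.k.", "united kingdom", "great britain"},
    "DE": {"de", "germany", "deutschland"},
}

# Invert the table once so each lookup is a single dictionary access.
LOOKUP = {
    variant: code
    for code, variants in COUNTRY_VARIANTS.items()
    for variant in variants
}


def normalize_country(raw):
    """Return the canonical code for a raw value, or None if it is unknown."""
    # Lowercase, trim, and collapse repeated whitespace before the lookup.
    key = " ".join((raw or "").strip().lower().split())
    return LOOKUP.get(key)


def normalize_records(records):
    """Split records into normalized ones and ones needing a steward's review."""
    clean, rejected = [], []
    for rec in records:
        code = normalize_country(rec.get("country"))
        if code is None:
            rejected.append(rec)  # unknown value: surface it, do not guess
        else:
            clean.append({**rec, "country": code})
    return clean, rejected
```

Values that cannot be mapped are returned separately rather than silently dropped, so that a person can decide whether the reference table needs a new entry.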
How does data governance tackle bad data?
That's part of the uniformity I mentioned. Duplicates, errors, and bad data are surfaced as part of the governance process. The addition of data lineage (where did this bad data come from?) and auditability (who made changes to this data/field and when, under what authority?) improves the quality and utility of the data in question.
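The ideas in this answer can be sketched in a few lines of Python: surfacing duplicates, recording where a record came from, and logging who changed a field, when, and under what authority. This is an illustrative sketch only. The names `find_duplicates` and `AuditedRecord` and the field layout are assumptions made for this example, and matching on a lowercased email is a deliberately simple stand-in for real matching rules.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone


def find_duplicates(records, key="email"):
    """Group record ids that share the same normalized key value."""
    groups = defaultdict(list)
    for rec in records:
        value = (rec.get(key) or "").strip().lower()
        if value:  # records with no key value cannot be matched this way
            groups[value].append(rec["id"])
    return {value: ids for value, ids in groups.items() if len(ids) > 1}


@dataclass
class AuditedRecord:
    data: dict
    source: str  # lineage: the system this record came from
    history: list = field(default_factory=list)

    def update(self, field_name, new_value, changed_by, authority):
        """Change one field and log who changed it, when, and on what authority."""
        self.history.append({
            "field": field_name,
            "old": self.data.get(field_name),
            "new": new_value,
            "changed_by": changed_by,
            "authority": authority,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })
        self.data[field_name] = new_value
```

For example, calling `update("phone", "555-0100", changed_by="jsmith", authority="customer request")` on a record leaves an entry in `history` that answers the three audit questions for that change.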
What's been hindering enterprises from establishing a data governance program?
In short, business value. Data project holders want governance programs but often can't state in simple terms what value those programs will drive for the business. One obstacle is that effective governance can't be achieved without also tackling MDM, data quality and enrichment, and workflows. This is closely related to data catalog, metadata management, and business glossary efforts -- which also fall short.
Cobbling together many point solutions is rarely effective, and the project implementation timelines often delay ROI beyond the reasonable expectations of the business.
What are best practices you recommend to enterprises interested in implementing a data governance program or improving one already in place?
Effective data governance takes an efficient combination of people, process, and technology. Applying an agile development mindset (start small with a minimum viable solution, iterate, and grow from there) will yield greater long-term benefits, while bringing the rest of the organization "along for the ride."
This is the opposite of the old advice to "set up a data governance board." Instead, that body should grow naturally out of the initial iteration, with experts who are part of the solution weighing in rather than taking ownership. This will yield better results than a dictatorial process.
What role does master data management (MDM) play in compliance preparations for regulatory mandates such as the GDPR?
MDM is the single most effective technology solution for addressing the GDPR. As I have said in recent articles, that "single view of a person" (customer, employee, supplier) is a human side of data. Correlating all of the data points you hold on every person with whom you interact is an MDM-led initiative that quickly employs governance and other capabilities to derive a meaningful way to access, rectify, port, and erase data as required by the new regulations.
It's also important to note that for GDPR, nontechnical people (such as data protection officers) will need to curate personal data -- so this can't be a purely technical database project.
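As a rough illustration of the "single view of a person" described above, the following Python sketch links the records several systems hold about one person and exposes the four operations mentioned: access, rectify, port, and erase. The `PersonHub` class and its method names are hypothetical, invented for this example. A real MDM platform would also have to push rectifications and erasures back to the source systems, which this in-memory sketch does not attempt.

```python
import json


class PersonHub:
    """Toy single view of a person: links records held in several systems."""

    def __init__(self):
        # person_id -> {system_name: record}
        self._records = {}

    def link(self, person_id, system, record):
        """Attach one system's record to a person."""
        self._records.setdefault(person_id, {})[system] = dict(record)

    def access(self, person_id):
        """Return a copy of everything held about the person."""
        held = self._records.get(person_id, {})
        return {system: dict(rec) for system, rec in held.items()}

    def rectify(self, person_id, field_name, value):
        """Correct a field wherever it is held; return the systems changed."""
        changed = []
        for system, rec in self._records.get(person_id, {}).items():
            if field_name in rec:
                rec[field_name] = value
                changed.append(system)
        return changed

    def port(self, person_id):
        """Export the person's data in a machine-readable form (JSON here)."""
        return json.dumps(self.access(person_id), sort_keys=True)

    def erase(self, person_id):
        """Remove the person from the hub; return True if anything was held."""
        return self._records.pop(person_id, None) is not None
```

Because every operation goes through one person identifier, a data protection officer does not need to know which underlying systems hold the data.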
How does strong data governance differentiate data lakes from swamps?
Data lake technology is becoming the infrastructure for storing, accessing, and retrieving large volumes of data. The idea of retaining all data, without regard to how it will be used later, has led to widespread interest in data science as a way to derive value. Such highly trained experts are needed precisely because the data is so disorganized.
Building a data hub (governance + MDM + quality + workflows) atop a data lake is an intelligent way to make the lake's data available to nontechnical users, have it interact with current business systems (which do not play well with technology such as Hadoop), and control access to and auditability of that data. These things working in concert prevent data swamps.
Where is data governance heading? What new benefits or challenges do you see on the horizon?
Governance has long been in the engine room of organizations. The GDPR brings it to the forefront for some, but the crossover will happen when regular business users -- nontechnical ones -- can be empowered to curate and manage data for which they have appropriate permissions. This will effectively push the role of data stewardship down into the business units where it can be more effective and useful.
James E. Powell is the editorial director of TDWI, including the Business Intelligence Journal and Upside newsletter.