6 Tips for Applying Data Quality Practices to Data Governance Programs
Apply to data governance much of what you already know about data quality.
- By Philip Russom, Ph.D.
- July 31, 2012
The barriers to data governance are erased when an organization adopts the techniques and best practices of data quality and the closely related practice of data stewardship. That's because the business-to-IT collaboration established by quality and stewardship practices is also required of data governance. In fact, quality and governance practices are similar, except that the needs of governance are broader, encompassing both enterprise data standards and business issues relative to data, such as compliance, risk, and privacy.
Instead of re-inventing the wheel, user organizations should apply to data governance some of the organizational structures and processes they learned from data quality programs. This minimizes the risks and decreases the time-to-use of data governance. Likewise, there are data quality tool capabilities that can help document, automate, and scale up data governance processes, as described in the following six tips.
Tip #1: Profile data early and often as you govern the data
Governing data effectively is challenging when you don't know the current state of data and its usage. This is why data profiling is quickly becoming an essential technique for data governance, just as it has become indispensable to data quality and data integration disciplines. Understanding enterprise data through profiling is a foundation for deciding which data needs governance, especially governance in the sense of establishing and enforcing data standards for data's quality, models, architecture, metadata, interfaces, lineage, and usage rules.
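As a rough illustration of what a basic profile contains, the following Python sketch computes null rates, distinct counts, and the most frequent values for a few columns. The sample records, column names, and the profile_column helper are hypothetical and stand in for whatever a profiling tool would report; this is not a description of any particular product.

```python
from collections import Counter

def profile_column(rows, column):
    """Summarize one column of a list of dict records."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing (an assumption for this sketch).
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    total = len(values)
    return {
        "column": column,
        "row_count": total,
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "distinct_count": len(counts),
        "most_common": counts.most_common(3),
    }

# Hypothetical sample records, invented for illustration only.
customers = [
    {"id": 1, "country": "US", "email": "a@example.com"},
    {"id": 2, "country": "us", "email": ""},
    {"id": 3, "country": "DE", "email": None},
    {"id": 4, "country": "US", "email": "d@example.com"},
]

profile = {col: profile_column(customers, col) for col in ("id", "country", "email")}
for summary in profile.values():
    print(summary)
```

Even this small profile surfaces candidates for governance: half of the email values are missing, and the country column mixes "US" and "us", which suggests a data standard for country codes is needed.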
Tip #2: Build a business glossary as you govern data
Most business glossaries are extensions of metadata management, so it's natural that most glossaries manage metadata (and related semantics, such as master and reference data). The glossary thus serves as an inventory of data for specific applications, such as data quality and data integration solutions. It can likewise inventory the data to be governed. Building this inventory is a common first step for a new data governance program, and the inventory must be revised as the program grows.
Tip #3: Extend data quality metrics to measure the governance of data
Data stewards and others involved in a data quality program regularly capture metrics about the state of data's quality, then study these in reports and analyses to assure the continuous improvement of data. Many data governance programs enforce policies about data standards, and data quality metrics make a good starting point for such policies.
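One way to picture a quality metric serving as the starting point for a governance policy is to pair each metric with a threshold that a governance committee might set. The sketch below does this in Python; the metric definitions, the POLICY thresholds, and the sample records are all assumptions made for illustration.

```python
def completeness(rows, column):
    """Fraction of rows where the column holds a non-empty value."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)

def uniqueness(rows, column):
    """Fraction of non-empty values that are distinct."""
    values = [row.get(column) for row in rows if row.get(column) not in (None, "")]
    return len(set(values)) / len(values) if values else 0.0

METRICS = {"completeness": completeness, "uniqueness": uniqueness}

# Hypothetical policy: minimum acceptable score per (column, metric) pair.
POLICY = {
    ("email", "completeness"): 0.95,
    ("id", "uniqueness"): 1.0,
}

def evaluate_policy(rows, policy):
    """Score each metric named in the policy and compare it to its threshold."""
    results = []
    for (column, metric), threshold in policy.items():
        score = METRICS[metric](rows, column)
        results.append({
            "column": column,
            "metric": metric,
            "score": score,
            "threshold": threshold,
            "compliant": score >= threshold,
        })
    return results

# Hypothetical sample records, invented for illustration only.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]

report = evaluate_policy(records, POLICY)
for line in report:
    print(line)
```

The metric functions are the same ones a steward would already track for quality; the only governance-specific addition here is the policy table that states what score counts as compliant.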
Tip #4: Remediate data that's out of compliance
Data metrics are very effective at revealing data that's out of compliance, whether compliance means technical standards for the structure and quality of data or business policies for data usage. Applied at run time, these metrics regularly find non-compliant data that must be remediated. Luckily, leading vendor tools for data quality, data integration, and other data management disciplines now support new automation for exception handling and other remediation tasks. These capabilities were originally designed for data stewardship, but they can also contribute to data governance.
Tip #5: Govern data in real time via validation and verification
We think of validation and verification as data quality tasks, but they also contribute to governance. After all, business rules and standards for data "govern" how data is to be validated and verified, and increasingly these kinds of policies are mandated by a governance committee.
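To make the idea of rules that "govern" validation concrete, here is a minimal Python sketch in which each record is checked against a small rule set as it arrives, and violations are set aside for follow-up. The rule names, the simplified email pattern, the ALLOWED_COUNTRIES reference list, and the ingest function are hypothetical; a real program would take its rules from the governance committee's policies.

```python
import re

# Deliberately simple pattern for illustration; not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical reference data for country codes.
ALLOWED_COUNTRIES = {"US", "DE", "FR"}

# Each rule is a (name, check) pair; check returns True when the record passes.
RULES = [
    ("email_format", lambda r: bool(EMAIL_PATTERN.match(r.get("email") or ""))),
    ("country_in_reference_list", lambda r: r.get("country") in ALLOWED_COUNTRIES),
]

def validate(record):
    """Return the names of the rules that the record violates."""
    return [name for name, check in RULES if not check(record)]

def ingest(record, accepted, exceptions):
    """Validate a record on arrival and route it to the matching list."""
    violations = validate(record)
    if violations:
        exceptions.append({"record": record, "violations": violations})
    else:
        accepted.append(record)
    return violations

accepted, exceptions = [], []
ingest({"email": "a@example.com", "country": "US"}, accepted, exceptions)
ingest({"email": "bad", "country": "XX"}, accepted, exceptions)

print(len(accepted), "accepted;", len(exceptions), "sent to exceptions")
```

Keeping the rules in a single list separates the policy (what must hold) from the mechanics of checking, so a change mandated by the committee means editing the list rather than the ingest logic.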
Tip #6: Use stewardship techniques to align data governance with business goals
For many organizations, data governance is an extension of a pre-existing data stewardship program that originated to ensure success with a data quality program. Many firms begin with a stewardship program and broaden it to cover data governance, such that data governance 1.0 is really data stewardship 2.0. This makes sense when data governance needs a heavy orientation toward data standards.
For more information, read the TDWI Checklist Report Using Data Quality to Start and Sustain Data Governance.
Philip Russom, Ph.D., is senior director of TDWI Research for data management and is a well-known figure in data warehousing, integration, and quality, having published over 550 research reports, magazine articles, opinion columns, and speeches over a 20-year period. Before joining TDWI in 2005, Russom was an industry analyst covering data management at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and consultant, served as a contributing editor with leading IT magazines, and worked as a product manager at database vendors. His Ph.D. is from Yale. You can reach him by email (email@example.com), on Twitter (twitter.com/prussom), and on LinkedIn (linkedin.com/in/philiprussom).