Beyond Data Governance in the Digital Business
The traditional distinction between governance planning and implementation won't work for data-driven digital businesses.
- By Barry Devlin
- August 28, 2017
Data governance raises difficult questions within business and IT. Both parties affirm the importance of high-quality data, but neither sees how to achieve it based on their current understanding of data governance. In many cases, the buck is passed, and overall progress is slow.
We need a new way of thinking about the topic if data-driven digital business is to succeed.
The Traditional Division: Oversight and Action
Data governance is a set of rules, processes, behaviors, and organizational structures and responsibilities for information creation and use. Their actual implementation in tools and technologies is usually called data management and considered a separate topic. In his book Data Governance, John Ladley maintains this split is necessary from organizational and process viewpoints because it separates oversight and action. Oversight is a business responsibility; action may be taken by business users and/or IT, supported by appropriate tools.
Accordingly, today business defines accountabilities and processes for ensuring quality data, often through the identification of data owners and appointment of data stewards. Given the complexity and disparity of much business data, IT has assumed the primary responsibility for hands-on management of data quality.
This strict division of labor is problematic in digital business. As I wrote in Business unIntelligence, virtually all business innovation now arises from information and technology. Information -- and the technology to support its management and use -- is the business and vice versa. The boundaries between IT and business must disappear. I use the word information (what the business needs), rather than data (what IT usually produces), to emphasize what digital business actually requires. Furthermore, a focus on information alone is too narrow. Despite a name that emphasizes data, digital business also demands fundamental process changes.
Modern Information Governance
Today it makes more sense to consider oversight and action as two ends of a gamut of processes needed to ensure data quality and information reliability. I'll use the term information and process governance (IPG) to cover this range.
With information front and center in every aspect of understanding, managing, and predicting business performance, IPG must become an active, everyday process for everyone who uses data. This implies that the discipline of good governance must stretch from data capture or sourcing through cleansing and information preparation, then to analysis, reporting, and decision making, and finally to actions.
In traditional data warehousing, IT assumes full responsibility for quality during sourcing and cleansing. The businesspeople responsible for decision making and taking action depend on that quality data from IT so they can apply business intelligence (BI) tools with confidence. However, data discovery tools -- perhaps better called information discovery tools -- encourage innovative businesspeople to get data wherever they can find it, often circumventing IT.
Addressing these issues requires embedding full-scope governance directly in the BI environment, where its use can be shared by business and IT staff, as discussed in my recent white paper. Such collaboration bridges the gap between business discovery of quality needs and problems and IT implementation of management processes, based on a shared understanding of governance demands in the context of existing data. We see evidence of IPG emerging in BI tools, with Yellowfin's recent release, for example, specifically targeting the governance function.
Data lakes present a particular challenge for governance. Here, the capture and preparation of externally sourced data, often of dubious provenance, falls to data scientists. They often have minimal training and little interest in data governance.
The tools used -- such as Trifacta, Alation, and Waterline Data Science -- have short development histories and still-emerging functionality. In many cases, the same data scientists are also responsible for analysis and decision making. As in the case of data discovery, IT knowledge and expertise in data governance are sidestepped in this process. Again, more focus on process and collaboration is required.
As digital business becomes more prevalent, governance issues are certain to become more pressing. Past experience with traditional data governance suggests that information and process governance in this new environment will be challenging. A good place to start, however, is to explore how pervasive governance can be applied to the BI environment and the benefits it can bring to business and IT alike.
Dr. Barry Devlin defined the first data warehouse architecture in 1985 and is among the world’s foremost authorities on BI, big data, and beyond. His 2013 book, Business unIntelligence, offers a new architecture for modern information use and management.