December 16, 2010
The Six Cs of Trusted Data
Philip Russom, Senior Manager of Research, TDWI
Topic: Data integration in support of business intelligence
In my last column for TDWI Experts in BI, I defined "trusted data" as:
Data that is drawn from carefully selected sources, transformed in accordance with data's intended use, and delivered in formats and time frames that are appropriate to specific consumers of reports and other manifestations of data.
These and other data properties assure that data is trustworthy from a technical viewpoint, as well as trusted by users who consume the data through reports and applications. Trust is important for data. Without trust, users may ignore supplied data and build their own data stores. Data in poor condition can lead to poor decisions.
In this column, I will drill deeper into how to achieve trusted data. A base assumption is that the problems resulting from non-trusted data can be avoided by following modern best practices in data integration, plus related disciplines such as data quality, data profiling, master data management, and metadata management.
The mere presence of these disciplines isn't enough, of course. Solutions created with data management techniques -- in order to produce trusted data -- must focus on the data properties that are key to trust. In short, the data must be complete, current, consistent, clean, compliant, and collaborative.
As luck would have it, the six data properties that are key to trust all start with the letter C, which is why I call them "the six Cs of trusted data." Let's take a look at how each contributes to trusted data.
Complete Data. This results from data integration techniques that produce consolidated data structures. For example, an enterprise data warehouse (EDW) fosters trust as the single -- and complete -- version of the truth for decision making. Likewise, the EDW provides a historical context for real-time data, and 360-degree views of customers give users confidence that they really know the customer.
Current Data. A common question in BI is: "How old is the data in this report?" With time-sensitive practices such as operational BI, fresh data is considered trustworthy, whereas stale data isn't. As data is delivered faster and more frequently (leaving little time for data preparation), delivering current data that's also consistent, clean, and compliant is a challenge.
Consistent Data. Consistency stems from consistently applying definitions of business entities, such as customers, products, and finances. Metadata management and master data management can improve consistency by documenting data's origins and meanings. Without consistency, users don't trust that data was sourced or aggregated properly, especially when data travels across multiple IT systems.
Clean Data. This is typically the result of data quality techniques, such as standardization, verification, matching, and de-duplication. Users' perception of data quality is probably the biggest challenge to trust, which is why data quality techniques are critical. Quality decisions and operational excellence both depend on clean data.
Compliant Data. Compliance regulations come from many sources. Some are external to your enterprise, including federal legislation and your partners. Others are internal, including your own standards for data architecture, quality, security, and privacy. Technical and business people need to trust that data has been accessed and distributed in accordance with multiple internal and external regulations. Achieving this level of trust may require a data governance board or similar organizational body.
Collaborative Data. First and foremost, collaboration over data helps ensure that data management and business management goals are aligned. Cross-functional collaboration improves trust in cross-departmental data sharing. Collaboration can also drive consensus on how business entities are defined in data. In many ways, the first five Cs are data properties, whereas the sixth one -- collaboration -- reaches across and unifies the other Cs. Collaboration is the "secret sauce" that adds trust to an EDW's complete data, operational BI's current data, data quality's clean data, and data governance's compliant data.
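Two of the Cs above, current and clean, lend themselves to simple automated checks. The following is a minimal sketch, not a method prescribed in this column: the function names, the 24-hour freshness threshold, and the sample customer names are all hypothetical, and a real data quality tool would do far more sophisticated matching.

```python
from datetime import datetime, timedelta


def is_current(loaded_at, now, max_age):
    """Answer "how old is the data in this report?" against a threshold.

    max_age is a hypothetical freshness limit chosen by the business.
    """
    return now - loaded_at <= max_age


def standardize(name):
    """Very simple standardization: lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


def deduplicate(names):
    """Keep the first occurrence of each name after standardization."""
    seen = set()
    unique = []
    for name in names:
        key = standardize(name)
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


# Hypothetical sample data for illustration only.
customers = ["Acme Corp", "acme   corp ", "Beta LLC"]
unique_customers = deduplicate(customers)

loaded_at = datetime(2010, 12, 15, 6, 0)
report_time = datetime(2010, 12, 16, 6, 0)
fresh = is_current(loaded_at, report_time, max_age=timedelta(hours=24))
```

Checks like these cover only a small part of the technical side of trust. Agreeing on what counts as "fresh enough" or "the same customer" is a matter for the collaboration described above.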
A Measuring Stick
In summary, let the six Cs of trusted data guide you. They are a measuring stick for both technical and business people, defining goals that data management staff must strive toward continuously. Yet, the six Cs define business users' requirements, too; satisfy these and data management solutions will be considered successful at delivering trustworthy data that users can feel confident about using.
Philip Russom is senior manager of TDWI Research. Philip can be reached at firstname.lastname@example.org.
Vendor Q & A - Alan Winters, Director of product management at Corda
Q: What's the best way to demonstrate ROI from a BI investment?
A: There's no one "best way" to show the value of BI because the objectives for each deployment can vary so widely. However, one approach that I've seen prove effective for a growing number of clients is to calculate the cost of manual reporting processes. This is often an ancillary measure, used when a more direct measure isn't readily available, but the key is that it is readily available in most projects.
For instance, one of our large enterprise customers was spending one day each week gathering and reporting sales and customer data for each of its product lines. The unfortunate recipient of that task had to pull information from Salesforce, Teradata, spreadsheets, and other data sources to compile a weekly report. That process alone required more than 400 hours of manual effort each year for every product line. After implementing an enterprise dashboard on top of the data sources, the company effectively eliminated and repurposed all of that manual effort by automating the reporting process.
Taken a step further, eliminating such a time-consuming manual process equates to tens of thousands of dollars in employee time each year, multiplied by dozens of different product lines. Very quickly you begin to see significant cost savings -- either by reducing head count or by delaying the addition of resources.
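The arithmetic behind this estimate can be sketched in a few lines. This is an illustration only: the 8-hour workday, 52 reporting weeks, $50 fully loaded hourly cost, and 24 product lines are assumptions chosen to be consistent with "one day each week," "more than 400 hours," "tens of thousands of dollars," and "dozens of product lines." They are not figures from the customer.

```python
HOURS_PER_WORKDAY = 8   # assumption: one reporting day is 8 hours
WEEKS_PER_YEAR = 52     # assumption: the report is compiled every week


def annual_manual_hours(days_per_week=1.0,
                        hours_per_day=HOURS_PER_WORKDAY,
                        weeks=WEEKS_PER_YEAR):
    """Hours of manual reporting effort per product line per year."""
    return days_per_week * hours_per_day * weeks


def annual_savings(hours_per_line, hourly_cost, product_lines):
    """Employee time freed up per year, expressed in dollars."""
    return hours_per_line * hourly_cost * product_lines


hours = annual_manual_hours()                       # per product line
per_line = annual_savings(hours, hourly_cost=50.0,  # hypothetical rate
                          product_lines=1)
total = annual_savings(hours, hourly_cost=50.0,
                       product_lines=24)            # hypothetical count

print(f"{hours:.0f} hours/year per product line")
print(f"${per_line:,.0f} per product line, ${total:,.0f} in total")
```

Under these assumptions, one day a week works out to 416 hours a year per product line, or $20,800 per line and $499,200 across 24 lines. Substitute your own labor cost and number of reports to get a figure you can defend.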
For more on this topic, check out this on-demand webinar, created by Gartner and Corda, that highlights other best practices for building the value of BI.
Copyright 2010. TDWI. All rights reserved.