TDWI FlashPoint Newsletter
ABOUT TDWI EXPERTS

TDWI Experts is a twice-monthly e-newsletter where BI/DW thought leaders share opinions and commentary about relevant industry topics and the latest technologies.

Feature

September 8, 2011

Data Quality: Garbage In Still Produces Garbage Out

Michael A. Schiff
Principal Consultant, MAS Strategies

Topic: Experts in Data Warehousing

Operational systems, third-party data providers, Web logs, social media sites, instant messages, emails, blogs, call center interactions, census information, and competitive intelligence are just some of the data sources that organizations may use in their analyses. Since most analyses are done to facilitate better decisions, it follows that these decisions are only as good as the source data upon which they are based. The computer industry calls this "garbage in, garbage out" (GIGO), but the concept is ancient and universal: incorrect data about enemy troop locations, poisonous versus non-poisonous plants, medical symptoms, or real estate values can lead to disastrous decisions with or without the use of any computer technology.

We now have a wide variety of delivery vehicles and display mechanisms, including printed reports, dashboards, spreadsheets and pivot tables, advanced visualization and auditory tools, and smartphone apps. Although these vehicles may make it far easier to spot trends and quickly alert us to exceptional conditions, the analyses they deliver are only as valid as the underlying data.

For data to be valid and of high quality, it must be consistent, accurate, timely, and complete. We owe it to our organizations to do all we can to make this possible. Analyses involving low-quality data are almost certain to produce low-quality results.

Among the many benefits of a data warehouse is that it serves as a control point for enhancing data quality and ensuring the enforcement of enterprise data standards. It facilitates an organization's ability to achieve a "single version of the truth" across the enterprise, while consolidating data from multiple, often heterogeneous, sources.

This is one of the reasons I have always been concerned about enterprise information integration (EII) or data virtualization implementations that run directly against operational systems without the discipline imposed by first cleansing, standardizing, transforming, and reformatting data when loading it into a data warehouse.

As a very simple example, consider a data warehouse that contains customer sales data from multiple business units, some of which are in the United States and some of which are in Europe. It is quite likely that each business unit records sales in its local currency (e.g., U.S. dollars or euros), so when this data is loaded into the data warehouse, it needs to be converted into a standard currency. Otherwise, the organization would be "adding apples and oranges."
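The conversion step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the exchange rate, the business unit names, and the function `to_standard_currency` are made up for the example, and a real load process would use dated rates from an agreed source.

```python
from decimal import Decimal

# Hypothetical, illustrative rates to the standard currency (U.S. dollars).
# A real load would look up the rate in effect on each sale date.
RATES_TO_USD = {"USD": Decimal("1.00"), "EUR": Decimal("1.40")}


def to_standard_currency(amount, currency, rates=RATES_TO_USD):
    """Convert a local-currency amount to the standard currency (USD)."""
    if currency not in rates:
        # Reject the record rather than silently adding apples and oranges.
        raise ValueError(f"no exchange rate for currency {currency!r}")
    return (Decimal(amount) * rates[currency]).quantize(Decimal("0.01"))


# Made-up sales records from two business units.
sales = [
    {"business_unit": "US-East", "amount": "1000.00", "currency": "USD"},
    {"business_unit": "Germany", "amount": "1000.00", "currency": "EUR"},
]

# Summing the raw amounts would give 2000.00 in no particular currency.
total_usd = sum(to_standard_currency(s["amount"], s["currency"]) for s in sales)
print(total_usd)  # 2400.00 with the illustrative rate above
```

Raising an error for an unknown currency is a design choice for the sketch: a record that cannot be standardized is stopped at load time instead of distorting later totals.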

Quality issues also arise when an organization's business units each assign their own customer (or vendor) numbers. Unless each business unit's customer numbers are converted to "corporate" customer numbers, queries such as "who are our top 10 customers in terms of 2010 sales" will yield incorrect results. Although these data-quality issues are among the simplest to resolve, failure to resolve them will lead to very poor decisions.
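A minimal sketch of the customer number problem follows. The cross-reference table, the identifiers, and the function `top_customers` are hypothetical, and the amounts are assumed to be in a standard currency already.

```python
from collections import defaultdict

# Hypothetical cross-reference:
# (business unit, local customer number) -> corporate customer number.
CUSTOMER_XREF = {
    ("US", "C-100"): "CORP-1",
    ("EU", "K-77"): "CORP-1",  # same customer under a different local number
    ("US", "C-200"): "CORP-2",
}


def top_customers(sales, xref, n=10):
    """Rank customers by total sales after mapping to corporate numbers.

    Returns (ranked, unmapped): the top-n list of (corporate number, total)
    pairs and the local keys that have no corporate number.
    """
    totals = defaultdict(int)
    unmapped = []
    for unit, local_id, amount in sales:
        corporate_id = xref.get((unit, local_id))
        if corporate_id is None:
            unmapped.append((unit, local_id))  # surface the gap; do not guess
            continue
        totals[corporate_id] += amount
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n], unmapped


# Made-up sales rows: (business unit, local customer number, amount).
sales = [
    ("US", "C-100", 500),
    ("EU", "K-77", 700),
    ("US", "C-200", 900),
    ("EU", "K-99", 50),
]

ranked, unmapped = top_customers(sales, CUSTOMER_XREF)
print(ranked)    # [('CORP-1', 1200), ('CORP-2', 900)]
print(unmapped)  # [('EU', 'K-99')]
```

Ranked by local numbers alone, customer C-200 (900) would appear to be the largest. After consolidation, CORP-1 leads with 1,200, which is the kind of incorrect ranking the unconverted query would hide.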

With the advent of self-service business intelligence and the concept of pervasive BI moving towards reality, user dependence on IT has been reduced. In general, this is a step forward, but it also removes from the process any special knowledge that IT may have about eccentricities in the data, which are potential problems that a business user might not be aware of.

Furthermore, many operational systems are also self-service: clients order products or book reservations over the Internet, and these end users may introduce source data errors that a generation ago would have been caught by data entry clerks.

One of the best ways to ensure quality data is to capture and correct errors at their source, and consumer self-service has served to weaken this safeguard. Any organization desiring to improve its data quality would do well to closely monitor consumer data entry with edits that, for example, verify address fields and catch fictitious customer telephone numbers such as xxx-555-1212 (the telephone number for area code xxx directory assistance).
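As a rough sketch of such an entry edit, the check below rejects the directory assistance number mentioned above. It assumes 10-digit North American telephone numbers; the pattern, the function name `check_phone`, and the sample numbers are illustrative only, and address verification is not shown.

```python
import re

# Assumes 10-digit North American numbers such as 206-555-1212,
# (206) 555-1212, or 206.555.1212. Other formats are rejected.
PHONE_PATTERN = re.compile(r"^\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})$")


def check_phone(raw):
    """Return (is_valid, reason) for a telephone number typed by a customer."""
    match = PHONE_PATTERN.match(raw.strip())
    if not match:
        return False, "not a 10-digit telephone number"
    area_code, exchange, line = match.groups()
    if exchange == "555" and line == "1212":
        # xxx-555-1212 is directory assistance, not a customer's own number.
        return False, "directory assistance number"
    return True, "ok"


print(check_phone("206-555-1212"))  # (False, 'directory assistance number')
print(check_phone("12345"))         # (False, 'not a 10-digit telephone number')
```

A check like this belongs at the point of entry, so the customer can correct the value before it ever reaches an operational system or the warehouse.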

Data quality extends far beyond our data warehousing efforts. Data quality is a fundamental component of any compliance reporting application, master data initiative, and operational system, as well as any business-to-business application among an organization, its vendors, and its customers. Most organizations consider their data to be a major corporate asset; if they truly believe this, they need to ensure that it is of the highest quality and recognize that garbage in will almost always generate garbage out. They also need to recognize that data quality is not a one-time exercise; rather, it is an ongoing process that must be continually monitored and maintained.

Michael A. Schiff is a principal consultant for MAS Strategies. He can be reached at mschiff@mas-strategies.com.

Copyright 2011. TDWI. All rights reserved.


