TDWI Upside - Where Data Means Business

Accessible Data Preparation: 6 Data Quality Tips

These six suggestions will help you improve your data prep and governance work.

Today's world moves at split-second speed. Whether you are pitching an idea, selling a product, presenting last quarter's results, or delivering the news, if you don't grab your audience's attention in the first few moments, you lose.

Under this "fast data" mentality, our focus on real-time pictures, graphics, and sound bites leaves us neither the time nor the tools to step back and validate the facts behind the words, dashboards, and striking visualizations. People with their hands in the data know the serious dangers this situation creates. They know how frequently data errors slip through the cracks and find their way into reporting, leading to incorrect conclusions and decisions -- especially when the data comes from manual processes.

Fortunately, attention to data governance and proper data preparation has been on the rise in thought leadership and media, and many tools and solutions are available to help organizations do a better job of data cleansing and preparation for analysis and reporting. However, even though we are getting better at automating the work, data preparation and cleansing are still un-sexy, onerous, time-consuming, never-ending tasks.

Even with the automation of many of these processes, human intervention is still critical. At a minimum, someone who knows the data needs to clarify the assumptions and create the rules and priorities that an automated data cleansing or data preparation tool needs.
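As a rough illustration of what "rules and priorities" might look like once a data expert writes them down, here is a minimal Python sketch. The `Rule` class, the `clean` function, and the two sample rules are hypothetical and not taken from any particular data prep tool; the point is only that a person decides what counts as an error, how to fix it, and in which order the fixes run.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A hypothetical cleansing rule defined by someone who knows the data."""
    name: str
    priority: int                      # lower numbers run first
    applies: Callable[[dict], bool]    # does this record need the fix?
    fix: Callable[[dict], dict]        # returns a corrected copy of the record


def clean(record: dict, rules: list) -> tuple:
    """Apply rules in priority order and report which ones fired."""
    applied = []
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.applies(record):
            record = rule.fix(record)
            applied.append(rule.name)
    return record, applied


# Illustrative assumptions: names should be trimmed, and a missing
# country is flagged as UNKNOWN rather than silently guessed.
RULES = [
    Rule("default_country", 2,
         lambda r: not r.get("country"),
         lambda r: {**r, "country": "UNKNOWN"}),
    Rule("trim_name", 1,
         lambda r: r["name"] != r["name"].strip(),
         lambda r: {**r, "name": r["name"].strip()}),
]

cleaned, applied = clean({"name": "  Acme Corp ", "country": ""}, RULES)
print(cleaned, applied)
```

The automated part is the loop; the assumptions encoded in each rule still come from a person.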

Here are six suggestions to help you improve your data preparation and governance.

#1: Budget resources for data preparation

Recognize that ensuring data accuracy is critical to your business. Stunning visualization tools and other methods to analyze and report data are useful but irrelevant if your data is not accurate. It is essential to provide adequate resources -- time, personnel, and data prep tools -- to clean up your data prior to analysis and reporting.

#2: Focus solutions at the right level

Data quality is a huge, daunting, and ongoing task. You're not going to address it all with one approach. Often enterprise-level solutions do not satisfy the requirements of your end users. User-friendly business solutions may not scale to your enterprise. That does not mean that solutions are in conflict. Tactical solutions for the business can still fit into your enterprise strategy. Don't let the size and scope of an enterprise solution stop you from implementing necessary tactical data quality solutions for one or more business units.

#3: Employ accessible tools

Who on your staff will be doing the actual "hands-in-the-data" work or reporting? What level of technical proficiency is required to use the data prep tool(s) you select? Find a tool or set of tools that is easy to install, access, learn, and use. Otherwise, data users will find creative ways to avoid the new tools and processes and instead default to tools they already know, even if they are slower, manual, or more error-prone.

#4: Be transparent

Make sure you have appropriate governance, transparency, and auditability in your tool(s) and processes. At any time, you (or your data prep tools) need to be able to answer precisely: "What is this number and how did it get here?"
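One way to picture this kind of auditability is a value that carries its own history. The following Python sketch is an illustration only: the `TracedValue` class, the file name, and the transformation steps are all hypothetical, and real data prep tools typically record lineage in their own ways.

```python
from dataclasses import dataclass, field


@dataclass
class TracedValue:
    """A number plus the list of steps that produced it (hypothetical design)."""
    value: float
    history: list = field(default_factory=list)

    def apply(self, step: str, func):
        """Return a new TracedValue, recording the step, input, and output."""
        new_value = func(self.value)
        entry = (step, self.value, new_value)
        return TracedValue(new_value, self.history + [entry])


# Hypothetical source: a revenue figure stored in cents in some file.
revenue = TracedValue(1250.0, [("loaded from sales.csv (assumed source)", None, 1250.0)])
revenue = revenue.apply("convert cents to dollars", lambda v: v / 100)
revenue = revenue.apply("round to 2 decimals", lambda v: round(v, 2))

# "What is this number and how did it get here?"
print(revenue.value)
for step, before, after in revenue.history:
    print(f"{step}: {before} -> {after}")
```

Because every step returns a new object, the earlier states are never overwritten, which is what makes the trail trustworthy in an audit.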

#5: Remember data governance and security

This should be an obvious priority. In these days of shared servers, collaborative processing, and cloud computing, your data may be more vulnerable to security and data integrity problems than ever before. You must consider who has access to your data as well as how quality assurance and change management are handled for it.

#6: Calculate costs and ROI

At the extreme, the value of having accurate data in your reporting is most obvious when you face the fallout after presenting inaccurate data. That's a difficult ROI to calculate. Easier and more palatable measures of ROI are the amount of time saved or the number and types of data errors found and fixed using a tool to simplify or automate data cleaning and data prep.

Ask your vendor to work with you on an ROI trial so you can compare the cost and time of doing a task with your current toolkit or manual processes against the cost and time of doing it with the tool(s) you are considering. Don't forget to include an "accessibility" factor in your ROI (see suggestion #3). After all, your people will be the ones executing these operations.
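The arithmetic behind such a trial can be sketched in a few lines of Python. The function name, its parameters, and the sample figures below are illustrative assumptions, not a standard formula; in particular, `adoption_rate` is one simple way to model the "accessibility" factor, as the share of the work your staff will actually do in the new tool.

```python
def trial_roi(manual_hours, tool_hours, hourly_rate, tool_cost, adoption_rate=1.0):
    """Estimate ROI for a data prep tool from a side-by-side trial.

    manual_hours  -- time the task takes with the current toolkit
    tool_hours    -- time the same task takes with the candidate tool
    hourly_rate   -- loaded cost of one hour of staff time
    tool_cost     -- cost of the tool over the same period
    adoption_rate -- assumed share (0 to 1) of work actually done in the tool
    """
    hours_saved = (manual_hours - tool_hours) * adoption_rate
    savings = hours_saved * hourly_rate
    return (savings - tool_cost) / tool_cost


# Made-up trial figures: 40 hours manually vs. 10 with the tool,
# $50 per hour, $500 tool cost, and only half the team adopting it.
full_adoption = trial_roi(40, 10, 50, 500)
half_adoption = trial_roi(40, 10, 50, 500, adoption_rate=0.5)
print(full_adoption, half_adoption)
```

With these made-up numbers, the same tool returns 200 percent at full adoption but only 50 percent when half the team avoids it, which is why accessibility belongs in the calculation.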

About the Author

David Lefkowich is the chief marketing and business development officer for FreeSight Software, a data integration, preparation, cleansing, analysis, and reporting tool.

