TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

Think
- Research & Resources
  - TDWI Playbook | Next Generation Data Science: The AI-Driven Data Science Life Cycle
  - TDWI Data Points | The Data Foundation for AI
  - TDWI Best Practices Report | Data Strategies and Foundations for Modern Data Management
  - TDWI Insight Accelerator | Adopting a Platform Approach for Gaining Insights from Unstructured Data
- Webinars
  - Expert Panel: What's Next in Data Integration: Powering the AI-Driven Enterprise August 25, 2025
  - Architecting a Modern Martech Stack for Speed, Scale, and AI Readiness August 26, 2025
  - Expert Panel: Improving Data Quality, Accuracy, and Consistency August 27, 2025
  - The State of Self-Service Analytics: Results from TDWI’s Latest Research September 8, 2025
- Virtual Summits
  - Virtual Events Keys to Making Your Data AI Ready September 10, 2025
  - Virtual Events Data Quality for BI, Analytics and AI October 22, 2025
  - Virtual Events Modern Data Strategy November 12, 2025
  - Virtual Events What’s Ahead in 2026 for Data & Analytics December 10, 2025
- By Topic
  - By Topic
    
    Explore the Latest AI, Analytics, and Data Research and Training by Topic
  - BI, Analytics, and Data Literacy
  - AI, Data Science, and Machine Learning
  - Data Management and Governance
  - Platforms and Architecture
  - Strategy and Methods
- Speaking of Data Podcast
  
  Current Research Surveys
Train
- In-Person Events
  - Conference TDWI Transform 2025 San Diego August 18, 2025
  - Executive Summit TDWI Modern Data Leader's Summit San Diego: AI in the Enterprise August 18, 2025
  - Conference TDWI Transform 2025 Orlando November 16, 2025
  - Executive Summit TDWI Data & AI Leaders Summit Orlando: Governing Data, Analytics, and AI November 17, 2025
- Virtual Live Seminars
  - Data Governance Week July 30, 2025
  - Platforms & Architecture Week July 30, 2025
  - AI Bootcamp Week July 30, 2025
- Online Learning
- By Topic
  - By Topic
    
    Explore the Latest AI, Analytics, and Data Research and Training by Topic
  - BI, Analytics, and Data Literacy
  - AI, Data Science, and Machine Learning
  - Data Management and Governance
  - Platforms and Architecture
  - Strategy and Methods
- Train Your TeamCustom solutions for training your team
  
  Get CertifiedEarn a professional credential in BI and Analytics, Data Governance, or AI
  
  TDWI MembershipExclusive access to the research, tools, training, and connections
Engage
- Connect
  - Connect and Contribute to Our Vibrant Community of Data Leaders
    
    Subscribe to TDWI Stay up to date on the latest news and events. Sign Up
    
    Become a TDWI Member Gain exclusive access to the research, tools, training, and connections to move your careers, teams, and projects forward. Learn More
    
    Become a Part of the TDWI Research Panel Make a difference in the data and analytics industry and earn incentives by sharing your insights with TDWI. Explore Now
    
    Speak at TDWI Events Share your expertise and build your personal brand as a speaker at a TDWI In-Person or Virtual Event. Submit a Proposal
    
    Become a TDWI Research Fellow Apply to be a member of TDWI’s industry leading research team. Apply Today
    
    Become a Member of the Data & AI Leaders Forum Engage in collaborative discussions, stay ahead of the curve, and stay in the know. Apply Now
    
    Showcase Your Data & AI Solutions Reach and engage with TDWI community through multi-channel marketing programs. Learn More

TDWI Articles

Six Validation Techniques to Improve Your Data Quality

Boost the data quality of your data warehouse with six practical techniques.

By Wes Flores
February 19, 2016

Have you ever had a set of reports that were distributed for years only to have your business users discover that the reports have been wrong all along and consequently lose trust in your data warehouse environment? Gaining trust is the foundation of user adoption and business value of your data management program.

Taking data quality seriously can be difficult if agility and speed-to-market are the name of the game for your business users. This is a lesson that is very costly to learn the hard way. How do you prevent these issues from occurring? After all, it's not like you didn't have validation checks as part of your standard process.

Full data-quality frameworks can be time-consuming and costly to establish. The costs are lower if you institute your data quality steps upfront in your original design process, but it is a valuable exercise to review and overhaul your data quality practices if you only have basic checks in place today.

Here are a few data validation techniques that may be missing in your environment.

Source system loop back verification: In this technique, you perform aggregate-based verifications of your subject areas and ensure it matches the originating data source. For example, if you are pulling information from a billing system, you can take total billing for a single day and ensure totals match on the data warehouse as well. Although this seems like an easy validation to perform, enterprises don't often use it. The technique can be one of the best ways to ensure completeness of data.

Ongoing source-to-source verification: What happens if a data integrity issue from your source system gets carried over to your data warehouse? Ultimately, you can expect that the data warehouse will share in the negative exposure of that issue, even if the data warehouse was not the cause. One way you can help catch these problems before they fester is to have an approximate verification across multiple source systems or compare similar information at different stages of your business life cycle. This can be a meaningful way to catch non data warehouse issues and protect the integrity of the information you send from the data warehouse.

Data-Issue tracking: By centrally tracking all of your issues in one place, you can find recurring issues, reveal riskier subject areas, and help ensure proper preventive measures have been applied. Making it easy for business users and IT to input issues and report on them is required for effective tracking.

Data certification: Performing up-front data validation before you add it to your data warehouse, including the use of data profiling tools, is a very important technique. It can add noticeable time to integrate new data sources into your data warehouse, but the long-term benefits of this step greatly enhance the value of the data warehouse and trust in your information.

Statistics collection: Ensuring you maintain statistics for the full life cycle of your data will arm you with meaningful information to create alarms for unexpected results. Whether you have an in-house-developed statistics collection process or you rely upon metadata captured with your transformation program, you need to ensure you can set alarms based upon trending. For example, if your loads are typically a specific size day in and day out and suddenly the volume shrinks in half, this should set off an alert and a full investigation should occur for such events. This situation may not trigger your typical checks, so it's a great way to find those difficult-to-catch situations on an automated basis.

Workflow management: Thinking properly about data quality while you design your data integration flows and overall workflows can allow for catching issues quickly and efficiently. Performing checks along the way gives you more advanced options to resolve the issue quickly. One example of this is to have strong stop and restart processes built into your workflow so that as an issue is found in the loading process, it can trigger a restart and determine if the issue was environment based. Enabling autocorrection of common challenges related to performance and environmental factors. It also can enable data quality checks that are only possible at the sub-task level during the processing window.

A Final Word

Although there may not be any truly foolproof way to prevent all data integrity issues, making data quality part of the DNA of your environment will go a long way in gaining the trust of your user community.

About the Author

Wes Flores has 25 years of experience in the technology and analytics space making an impact as a senior IT executive with Fortune 100 and SMB companies alike. As a strategist and solution architect, Wes brings a business-first mindset to technology solutions and approaches for his clients. His experience spans private and public sectors across many industries such as telecom, healthcare, insurance, finance, hospitality, and higher education. Wes has been honored with numerous industry awards including Top 25 Information Managers from Information Magazine and the BI Perspectives award from Computer World for Business Intelligence Solutions. Most recently his company, of which he is a co-founder, was recognized as a top 10 analytics solutions provider by CIO Applications magazine. Wes shares his expertise gained in advisory and program leadership roles through industry blogs and international speaking engagements. You can contact the author via email.

In-Depth Training on Data Quality

TDWI offers industry-leading education on best practices for data quality. Check out upcoming conferences and seminars to find full-day and half-day courses taught by experts. Save 30% on your first event with code 30Upside!

TDWI Membership

Accelerate Your Projects,
and Your Career

TDWI Members have access to exclusive research reports, publications, communities and training.

Individual, Student, and Team memberships available.

↑

TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

Research & Resources

Webinars

Virtual Summits

By Topic

In-Person Events

Virtual Live Seminars

Online Learning

By Topic

Connect and Contribute to Our Vibrant Community of Data Leaders

TDWI Articles

Six Validation Techniques to Improve Your Data Quality

In-Depth Training on Data Quality

Related Articles

Trending Articles

Breaking Barriers in Conversational BI/AI with a Semantic Layer

AI in 2025: Key Considerations for Technology Leaders

The Tech Blanket: Building a Seamless Tech Ecosystem

What’s Ahead in Generative AI in 2025? (Part Two)

TDWI Membership

Accelerate Your Projects,
and Your Career

TDWI

Engage

Research

Research & Resources

Webinars

Virtual Summits

By Topic

In-Person Events

Virtual Live Seminars

Online Learning

By Topic

Connect and Contribute to Our Vibrant Community of Data Leaders

TDWI Articles

Six Validation Techniques to Improve Your Data Quality

In-Depth Training on Data Quality

Related Articles

Trending Articles

Breaking Barriers in Conversational BI/AI with a Semantic Layer

AI in 2025: Key Considerations for Technology Leaders

The Tech Blanket: Building a Seamless Tech Ecosystem

What’s Ahead in Generative AI in 2025? (Part Two)

TDWI Membership

Accelerate Your Projects, and Your Career

TDWI

Engage

Research

Accelerate Your Projects,
and Your Career