Executive Q&A: Data Quality, Trust, and AI

John Nash, the chief marketing and strategy officer at Redpoint Global, explains why data quality, trustworthiness, and golden records matter to AI.

Upside: What are some of the consequences of ignoring data quality when using AI to generate a more personalized customer experience (CX)?

John Nash: Any AI process -- analysis, training, prediction, recommendations, or content generation -- will only be as good as the data that fuels it. All of these processes will, to some degree, impact the delivery of a personalized CX. Ignoring data quality means that AI-driven decisions are made on an incomplete or inaccurate data set. From a practical standpoint, that translates to customer experience friction throughout a customer journey: irrelevant offers, lack of a cross-channel understanding, or a frustrating browsing session.

A personalized CX requires high-quality data that is cleansed, matched, merged, and always ready for business use. A unified customer profile must precisely represent a customer, household, or business entity and be continuously updated to reflect a customer’s engagement with a brand in real time. When data quality processes are complete as part of the profile creation, downstream AI processes will produce trusted outcomes. Using AI for audience segmentation, for example, will produce meaningful segments that reveal something important about a company’s customers that can improve predictions, recommendations, and content generation.
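To make the segmentation point concrete, here is a minimal sketch of AI-driven audience segmentation over cleansed, unified profile data. It is an illustration only: the feature names are hypothetical, and the use of scikit-learn’s k-means is an assumption, not a description of any particular product’s approach.

```python
# Illustrative sketch: segmenting unified customer profiles with k-means.
# Feature names are hypothetical; any numeric attributes from a cleansed,
# merged profile (recency, frequency, monetary value, ...) would do.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

profiles = pd.DataFrame({
    "days_since_last_purchase": [3, 180, 45, 7, 365, 12],
    "orders_last_year":         [12, 1, 4, 9, 0, 7],
    "avg_order_value":          [82.5, 19.0, 55.0, 120.0, 0.0, 64.0],
})

# Scale features so no single attribute dominates the distance metric.
X = StandardScaler().fit_transform(profiles)

# Three segments chosen arbitrarily for the sketch; in practice the count
# would be validated (e.g., with silhouette scores) against business goals.
profiles["segment"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(profiles.sort_values("segment"))
```

The precondition matters more than the algorithm here: clustering yields meaningful segments only when the profile attributes feeding it are complete and correctly merged.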

How is data quality related to data trustworthiness, and how can marketers trust AI outcomes (for example, that an AI-driven segment was trained on the right data set)?

High-quality data must underpin any AI process, but it is also true that AI must train on the right data set. As an example, consider a chatbot conversing with a native French speaker. A chatbot that recognizes and responds to the nuances of the dialect will provide a more personalized experience. However, what if the chatbot “learned” French by being trained on the French-Canadian dialect? Should a marketer in this situation trust the outcome? Similarly, if you’re using an LLM to produce, say, a list of customers with a high propensity to churn -- are you confident that the model behind it was trained on the right data?

Because no data is 100% perfect, marketers and business users who leverage AI must have transparency into decision-making. What is AI basing its decision or outcome on? Data observability is the process of interrogating data as it flows through a marketing stack -- including data used to drive an AI process. Data observability provides crucial visibility, helping users interrogate the data and understand its quality before building an audience or executing a campaign. Data observability has traditionally been done through visual tools such as charts, graphs, and Venn diagrams, but it is itself becoming AI-driven, with some marketers using natural language processing and LLMs to directly interrogate the data used to fuel AI processes.
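As a minimal sketch of what “interrogating data as it flows” can look like in practice -- the column names and the 10 percent threshold below are illustrative assumptions:

```python
# Illustrative sketch: basic observability checks on an inbound customer table.
# Column names and the completeness threshold are hypothetical.
import pandas as pd

def observe(df: pd.DataFrame) -> dict:
    """Report simple quality signals to review before building an audience:
    completeness, duplication, and freshness."""
    return {
        "row_count": len(df),
        "null_rate_by_column": df.isna().mean().round(3).to_dict(),
        "duplicate_email_rate": float(df["email"].duplicated().mean()),
        "latest_update": df["updated_at"].max(),
    }

customers = pd.DataFrame({
    "email": ["a@x.com", "b@x.com", "a@x.com", None],
    "updated_at": pd.to_datetime(["2024-05-01", "2024-05-02",
                                  "2024-04-30", "2024-05-02"]),
})

report = observe(customers)
print(report)

# A simple quality gate: flag the feed if email completeness is too low.
if report["null_rate_by_column"]["email"] > 0.10:
    print("WARN: email completeness below threshold")
```

A real observability layer would track many more signals (schema drift, value distributions, lineage), but even checks this simple let a marketer see the level of data quality before an audience is built.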

Beyond data observability, trust in outcomes is also bolstered by ensuring that AI integrates with the marketing stack in a way that sets the business up for success according to the intended use cases. To best optimize AI in a marketing stack, organizations should have the infrastructure flexibility to connect to best-of-breed third-party applications for segmentation, journey optimization, modeling, analytics, and generative AI innovations.

Just as trust is diminished without data quality, trust is diminished if a packaged marketing technology in some way prevents the use of innovative external AI capabilities. When marketing technology includes data observability capabilities, provides a single source of truth for customer data, and supports open AI integrations, marketers can trust AI outcomes across the entire marketing infrastructure.

Given the importance of data quality for AI, what best practices can you recommend to ensure your processes actually improve data quality in your organization? How do you then instill trust in AI-generated data among your marketing staff as data flows through your organization?

One best practice is to ensure data quality processes are complete at the point of ingestion into a customer data platform (CDP) or another marketing technology, particularly a solution that builds a unified profile and is the single source of truth for customer data. Too many organizations put off data quality until data extraction, at which point it’s too late -- leading to poor match quality, where the right data is not attached or linked to the right customer.

Best practices for data quality at ingestion include a combination of deterministic and probabilistic matching (i.e., both rules-based exact matching and statistical fuzzy matching), data stewardship and data lineage processes to correct faulty data, and data quality at different levels (individual, household, and business). When high-quality data fuels AI processes and produces better outcomes, those outcomes will produce a new trove of first-party data that can then be fed back into a CDP, creating a closed-loop AI cycle where better data produces better outcomes, leading to even better data.
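As a rough sketch of how deterministic and probabilistic matching can combine at ingestion, the example below tries an exact key match first and falls back to fuzzy name similarity. The similarity measure (Python’s standard-library SequenceMatcher) and the 0.85 threshold are assumptions for illustration; production identity resolution draws on far more signals and stewardship.

```python
# Illustrative sketch: deterministic match on email, probabilistic fallback
# on name similarity. Standard library only; the threshold is hypothetical.
from difflib import SequenceMatcher

existing = [
    {"id": 1, "email": "jane@x.com", "name": "Jane Doe"},
    {"id": 2, "email": "j.smith@y.com", "name": "John Smith"},
]

def match(record: dict, threshold: float = 0.85) -> int | None:
    """Return the id of the best existing profile, or None if no match."""
    # Deterministic pass: exact, rules-based key comparison.
    for p in existing:
        if record.get("email") and record["email"].lower() == p["email"].lower():
            return p["id"]
    # Probabilistic pass: score fuzzy similarity and keep the best candidate.
    best_id, best_score = None, 0.0
    for p in existing:
        score = SequenceMatcher(None, record["name"].lower(), p["name"].lower()).ratio()
        if score > best_score:
            best_id, best_score = p["id"], score
    return best_id if best_score >= threshold else None

print(match({"email": "JANE@X.COM", "name": "Jane D."}))      # 1 (deterministic)
print(match({"email": "js@z.com", "name": "Jon Smith"}))      # 2 (probabilistic)
print(match({"email": "new@z.com", "name": "Ada Lovelace"}))  # None (no match)
```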

How is the marketing staff’s trust impacted when an organization works with siloed data or processes? What kinds of inefficiencies are introduced to AI when you use data silos? What data hygiene approach can you recommend when using AI across multiple systems to guard against these inefficiencies?

In a way, data silos are as much a source of great distress to AI as they are to the customer experience itself. A marketer might, for example, use an LLM to help generate amazing email subject lines, but if AI generates those subject lines knowing only what is happening in that one channel, it is limited by not having a 360-degree view of the customer. Each system might have its own concept of a customer’s identity by virtue of collecting, storing, and using different customer signals. When siloed data is updated on different cycles, marketers lose the ability to engage with a customer at the customer’s precise cadence because the silos are out of sync with the customer journey.

Buy online, pick up in store (BOPIS) is a good example of a use case that is hard, if not impossible, to pull off without understanding a customer’s real-time cadence. The way to guard against these inefficiencies is to create a golden record for a customer -- a unified customer profile that contains an identity graph and a contact graph with all behavioral and transactional data. Updated in real time and accessible across the enterprise, a golden record reflects a current understanding of a customer across all channels -- matching the cadence of the customer journey.
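As a simplified sketch of the underlying data structure -- the identifiers and attribute names are hypothetical -- a golden record can be modeled as one profile that accumulates every known identifier (a crude stand-in for an identity graph) and takes the most recent value for each attribute:

```python
# Illustrative sketch: folding per-channel events into one golden record.
# Identifiers and attribute names are hypothetical.
from datetime import datetime

def build_golden_record(events: list[dict]) -> dict:
    """Collect every known identifier and let the most recent event win
    for each attribute (last-write-wins in timestamp order)."""
    golden = {"identifiers": set(), "attributes": {}, "updated_at": None}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        golden["identifiers"].update(event.get("identifiers", []))
        golden["attributes"].update(event.get("attributes", {}))
        golden["updated_at"] = event["timestamp"]
    return golden

events = [
    {"identifiers": ["loyalty:991"], "timestamp": datetime(2024, 5, 3),
     "attributes": {"last_channel": "mobile_app"}},
    {"identifiers": ["email:jane@x.com"], "timestamp": datetime(2024, 5, 1),
     "attributes": {"preferred_store": "Downtown", "last_channel": "web"}},
]

golden = build_golden_record(events)
print(golden)
# last_channel resolves to "mobile_app" because that event is more recent.
```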

Using AI in concert with a golden record cements trust in the outcomes, both because a golden record only includes data that has been cleansed and made ready for business use, and because a golden record will brook no data silos; it will include everything there is to know about a customer across all interaction and engagement touchpoints.
