
Q&A: Master Data Management

Master data management expert David Loshin explains common MDM misconceptions, why MDM can be such a struggle, and the keys to having a successful MDM project.

David Loshin is an expert in master data management (among many other data-related topics) and is author of the book, "Master Data Management." His popular training course, Modernizing Master Data Management, is currently available from TDWI.org.

We recently got the chance to ask him a few questions about master data management; here's what he shared with us:

Q.: Why do you think most MDM initiatives fail even after multiple re-launches?

D.L.: Data professionals sometimes have a blind spot when it comes to understanding the organizational needs for producing high-quality, consistent master data. Many practitioners believe that there truly is a “golden record” for each master entity (such as a customer or a product) and that there is a process of absorbing records from across a variety of source systems and merging those records into this (I would call it mythical) “golden record.” Yet by virtue of the decision processes for determining what source data goes into that golden record, the result is yet another version of the entity data that is inconsistent with the other records across the enterprise landscape. 

A technology-driven MDM program that begins with this assumption is not going to be positioned for success, and repeated iterations of MDM projects that neglect to account for this flawed assumption will end up with similar results. 

What do executives need to understand before greenlighting an MDM program?

First, executives need to internalize the full complexity of MDM. Enterprise-wide integration is necessary at the semantic level to empower all the master data users to leverage master data in the ways that suit their needs. Second, I would say that driving MDM as a technology acquisition project is doomed to eventual failure.

Focus on the end goals that are aligned with corporate strategy: Identify which business processes are impeded by a lack of visibility into a complete view of information about unique entities, then evaluate what measurable improvements can be achieved by providing that visibility. Then consider the aggregate value of those improvements to prioritize and frame how the MDM program is to be designed, implemented, and deployed.

Can you explain identity resolution in simple terms—why is it so critical and where do most companies struggle with it?

We live in an imperfect world, and it is not unusual for a data set to contain multiple data instances that have varying data yet refer to the same real-world entity. At the same time, there may be apparent duplicate information between two records, yet those records really refer to different entities. I always try to boil it down to this: Identity resolution is all about balancing unification and differentiation. That means unifying the representation of a single entity that appears more than once with different attribute values, while differentiating distinct entities that are represented using the same or similar attribute values.

For example, my name is represented in various places as “David Loshin” and “Howard David Loshin” since I go by my middle name, but some databases require you to provide your legal name. So a data set that has two records with each of those names in it is going to have a duplicate record for me. However, I know that there are at least 2 or 3 other people in the U.S. with the name David Loshin (I often get their emails!). A data set that has the same name twice referring to two different people does not have a duplicate. 

A naive MDM implementation will not be well positioned to manage these issues; simply applying identity resolution tools to match records may result in both false positives and false negatives. An effective MDM program will consider all the characteristics and attributes necessary to use identity resolution tools to enable the materialization and rendering of matched data in the ways that best meet the business use cases. 
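
To make the balance between unification and differentiation concrete, here is a minimal Python sketch. It is an illustration only, not a method described in the interview: the `Record` type, the `email` attribute, the made-up example addresses, and both matching functions are hypothetical. It assumes that an email address happens to be a reliable distinguishing attribute, which will not hold for every data set. Matching on the name alone misses the legal-name variant and merges two different people, while adding a second attribute corrects both mistakes in this toy case.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Record:
    record_id: str
    name: str
    email: str  # hypothetical extra attribute used to tell people apart


def name_tokens(name: str) -> frozenset[str]:
    """Lowercased set of the words in a name."""
    return frozenset(name.lower().split())


def naive_match(a: Record, b: Record) -> bool:
    """Naive rule: two records match only when the names are identical."""
    return a.name.lower() == b.name.lower()


def attribute_match(a: Record, b: Record) -> bool:
    """Illustrative rule: one name's words are contained in the other's
    (so a middle-name form matches the legal-name form) AND the emails agree."""
    ta, tb = name_tokens(a.name), name_tokens(b.name)
    names_compatible = ta <= tb or tb <= ta
    return names_compatible and a.email.lower() == b.email.lower()


# Made-up sample data: r1 and r2 are the same person, r3 is a namesake.
records = [
    Record("r1", "David Loshin", "dl@example.com"),
    Record("r2", "Howard David Loshin", "dl@example.com"),
    Record("r3", "David Loshin", "other.david@example.org"),
]


def matched_pairs(match) -> list[tuple[str, str]]:
    """Return the ID pairs that the given rule declares to be the same entity."""
    return [
        (a.record_id, b.record_id)
        for a, b in combinations(records, 2)
        if match(a, b)
    ]


naive_pairs = matched_pairs(naive_match)          # merges r1 with r3, misses r1/r2
attribute_pairs = matched_pairs(attribute_match)  # links r1 with r2, keeps r3 apart

print("name only:      ", naive_pairs)
print("name plus email:", attribute_pairs)
```

A real program would weigh many more attributes and would tune its matching rules to the business use case, since the acceptable trade-off between false positives and false negatives differs from one use case to another.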

What's the difference between building a "single source of truth" and what modern MDM should actually deliver?

First, I bristle at the term “single source of truth.” In a modern enterprise data landscape, it is unlikely that any of the systems could be deemed a “source of truth.” Even more important to consider is that no MDM framework is going to be able to copy multiple data instances from a variety of sources to create yet another inconsistent data instance that could be qualified as “truth.” In reality, different business perspectives frame the way that data is interpreted, and so what is considered to be truthful to one user may be irrelevant to others. 

A modern MDM system should deliver what I discuss in the course: "an environment composing technology and information that provides a means for consistently identifying, sourcing, organizing, synchronizing, and easily accessing a desired perspective of authoritative data about unique entities drawn from across an information landscape."

What are the biggest takeaways attendees of your course "Modernizing Master Data Management" can expect? 

In the course, I make a point of discussing how impediments to MDM success continue to plague initiatives, whether because MDM is being driven solely by the technology or data teams, because the acquisition of technology products is considered enough of a solution, or because the established software solution’s performance is inadequate to meet the business needs. In turn, the issues that motivated the introduction of master data management in the first place continue to impact the quality and usability of entity information. Technology-driven initiatives that are not fundamentally rooted in improving some aspect of the business often increase the technical debt and add complexity to the environment that is not balanced by an appreciable increase in value.

So the biggest takeaway is that instead of thinking about MDM as a technology solution, we need to reframe MDM in a way that is largely driven by the business: identify business use cases, assess information value and opportunities for monetization, and then drive process design and tools acquisition based on the synthesized user requirements. This modernized approach helps identify the technical requirements based on the information needs in accordance with the business use cases. By identifying those use cases before opting for a technical solution, your organization may be able to prioritize needs that have been conformed across multiple monetization opportunities.

Preview David's MDM course on TDWI.org here.