Q&A

Getting Started with Master Data Management

Clearing up the misunderstandings, and understanding the challenges, of an MDM project

We explore the five aspects of MDM that Baseline Consulting has identified and the top mistakes companies make when beginning an MDM project.

“When companies think MDM can replace a data warehouse, there’s clearly a misunderstanding of these two complementary technologies. That’s like saying the Dewey decimal system for organizing books is a replacement for the books themselves.” So says Evan Levy, a partner and co-founder of Baseline Consulting, where he leads both practitioners and executives in delivering a range of IT solutions.

In this interview, he dispels some common master data management myths, and discusses some of what he is seeing in the field with Baseline clients who are wrestling with MDM solutions.

TDWI: What’s the biggest challenge you see with clients regarding MDM?

Evan Levy: Here’s the real challenge we’re seeing with MDM: A lot of people don't realize this, but it's not about how the data is used. Whether I'm using it for a report, to support a business operation, or in an application like CRM, master data is master data.

We've seen a lot of noise and confusion because certain vendors don't support the breadth of MDM functionality. The fact is, MDM is not a single function. [At Baseline Consulting,] we’ve identified five discrete aspects (or functions) inside MDM, if you will: content, relationship, access, change management, and processing. Let me discuss each.

There's content, which means, for example, how to identify individuals -- by first name, last name, address, and so forth. How do I represent those values? How will the first and last name be identified or represented? Is the “address” sufficient or do I need to identify a home address and a business address -- and so forth. How do I identify the attributes that describe the master element?

Relationship is the second area and identifies if and how individual master elements may be related to one another (such as groupings and hierarchies). Is this person a member of the household or other common group, like people who work together? Is this product a taxable or non-taxable product?

Access is the third area, and identifies the accessibility of the individual master elements (and their associated descriptive details). In addition to the traditional data-oriented functions such as create, read, write, and delete access, MDM provides an additional level of capabilities that controls access to the individual elements. This ensures that both analytical and operational systems can share the same content without stepping outside the respective access boundaries.

The fourth area is change management, which focuses on ensuring that any changes to the master elements are managed and communicated effectively to prevent the misuse or misunderstanding of the information. The aspect of change management within MDM is ensuring that there’s synchronization and reconciliation of the changed data with all the parties that rely on it. A good example is when the phone company decides to come out with a new area code. There are well-defined methods to ensure that everyone is told in advance and prepared for the change.

Finally, there’s processing. Although the prior details we’ve discussed encompass the breadth of functions included within MDM, the final area focuses on the actual processing that can utilize these functions. The first four areas identify the “whats” inside MDM; the fifth identifies the “how.” This area focuses on defining the rules of MDM processing: identify (or read), match, create, update, merge, and delete. A good example is how the MDM system determines if two master records (e.g. John Smith and J. Smith) refer to the same individual. Another example might be comparing two master records (John Smith and Mary Smith) to determine if they are members of the same household. Processing actually is nothing more than a set of rules (or logic) that determines how the MDM processes are executed.
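The matching and household rules described above can be sketched in code. What follows is a minimal illustration, not Baseline Consulting's method or any vendor's API: the `MasterRecord` fields, the normalization step, and the rule that a single initial matches a full first name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterRecord:
    # Hypothetical content attributes; a real master element would carry more.
    first_name: str
    last_name: str
    address: str


def _norm(value: str) -> str:
    """Lowercase, drop periods, and collapse runs of whitespace."""
    return " ".join(value.lower().replace(".", "").split())


def same_individual(a: MasterRecord, b: MasterRecord) -> bool:
    """Assumed match rule: same last name and address, and the first
    names are equal or one is a single initial of the other."""
    if _norm(a.last_name) != _norm(b.last_name):
        return False
    if _norm(a.address) != _norm(b.address):
        return False
    fa, fb = _norm(a.first_name), _norm(b.first_name)
    if not fa or not fb:
        return False
    if fa == fb:
        return True
    # A bare initial ("J") matches a full name starting with that letter.
    return (len(fa) == 1 or len(fb) == 1) and fa[0] == fb[0]


def same_household(a: MasterRecord, b: MasterRecord) -> bool:
    """Assumed household rule: two different individuals who share a
    last name and an address."""
    return (
        _norm(a.last_name) == _norm(b.last_name)
        and _norm(a.address) == _norm(b.address)
        and not same_individual(a, b)
    )


john = MasterRecord("John", "Smith", "12 Oak St")
j = MasterRecord("J.", "Smith", "12 Oak St.")
mary = MasterRecord("Mary", "Smith", "12 oak st")

print(same_individual(john, j))    # John Smith vs. J. Smith
print(same_household(john, mary))  # John Smith vs. Mary Smith
```

A rule this simple would also merge "J. Smith" with a Jane Smith at the same address. That is why the processing rules have to be defined against the business need rather than taken as a default.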

When you take a look at that breadth of functionality, one of the biggest challenges is to say, OK, out of all of the details within the five functional areas -- content, relationship, access, change management, and processing -- what is it that you need to support your business needs? Once we know that, we can figure out how to solve your problem.

So a big challenge is to make MDM specific to your business and your needs?

You mustn’t bite off too much. With BI, there's a world of difference between doing a canned report and doing data mining. No one should start with data mining, because you first need clean data and you need to get people to trust the content. You also need a certain amount of data volume and data breadth to be successful. If all I have is your name, address, and phone number, I can't really tell you about purchase trends.

The same is true of MDM. It’s important to identify the business capabilities that require MDM functionality and then determine the individual MDM functions that will be necessary to support the business. I’ve seen too many projects that were unnecessarily complicated because the development team was focused on implementing all the functionality available within a product rather than focusing on the functionality required by the business.

How do I decide where to start if I want to initiate some sort of MDM project at my company?

First of all, the word that I love to use is scope. Tell me your need, pain, or problem, and then tell me the scope. Everybody talks about enterprise data warehousing, but when you get down to it, [clients often say that] what would really help the business is if I just had this report for this group -- we could save a million dollars. Well then, let's not build an enterprise intergalactic warehouse, let's put an earmark for that group and solve that million-dollar problem.

With MDM, let's understand what the scope is. Is it enterprisewide? Is it line of business? Is it organizational? Is it a group? Is it a project? Let's understand where confusing reference data is causing a problem in business decision-making or business action.

If I don't have to share data across systems, and I don't have to integrate it, then I don't need master data. The need for master data occurs if there is more than one instance of subject area content. If your company has only one system containing customer identity and address, then you may (unknowingly) already have mastered your customer information.

MDM is required when there is more than one instance (or system) containing subject area information. If I have a choice of which copy of data to use, I likely need MDM.

What are some top mistakes you see companies making when they come to you and ask for help with MDM?

The most common mistake -- or call it the most common misunderstanding -- is the presumption that MDM will replace the data warehouse. No. MDM allows me to capture and reuse the management and integration logic associated with an individual subject area. This ensures that every system that needs to identify, match, update, read, or write subject area data does it in a consistent manner. This ensures that all of the developers who need to integrate or retrieve data from multiple systems for their application don't have to reinvent that logic -- they reuse that logic. The MDM hub contains that intellectual capital. Inherently, it doesn't extract or load data on the fly for operational systems; it contains the logic and will process the data. The individual application systems handle the movement of the data between systems.

The other big misconception is that everybody seems to think MDM is just storing your list. It's actually about managing volatility and change to the list, and communicating that change to all interested parties. That’s why we typically spend time discussing the five different functional aspects of MDM with clients. It’s not uncommon for the big “aha” to occur once we’ve discussed the change management area.

It sounds like there are some fundamental misunderstandings of what MDM really is.

Yes, that’s part of the challenge right now, because MDM is so popular and trendy.

Where's the ROI in MDM? Is there generally a measurable return on investment?

There is. I would characterize it as an IT level of infrastructure return, which is removing and reducing data integration cost significantly. We all know that with BI and other systems, 40 percent of the cost of the development is in data integration, and a lot of that activity is dealing with cleansing, standardizing, and determining how data integrates from different sources.

With ETL accounting for 40 percent of BI development cost, take a look at what's involved with MDM. I'm going to profile and learn about the data once, and I'm going to build the integration. Then I'm done. When you bake a cake, you follow the recipe. You don’t invent it each time. With ETL or data integration, they’re inventing the recipe every time.

That sounds like a pretty significant savings.

It is. Think about how companies are budgeted and run. You hire sales people based upon the customers you have. You have finance, purchasing, and other organizations based upon the quantity of suppliers that you have. With numbers potentially being bloated, you have unknown risk and waste throughout the organization. It's not about getting rid of people, it's about making people more efficient.

Another thing that we still find is that the amount of time that business people spend searching for information is staggering. It's not uncommon for a business analyst or data analyst to burn an hour or two a day searching for data just to be able to do their analysis. Do the math. You can start seeing instances where the savings come from allowing workers to focus on their actual jobs. The ROI is simply more accurate information.
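To make "do the math" concrete, here is a small sketch of the arithmetic. The eight-hour working day is an assumption introduced for illustration; only the "hour or two a day" figure comes from the interview.

```python
def search_time_share(hours_searching_per_day: float,
                      hours_worked_per_day: float = 8.0) -> float:
    """Fraction of the working day spent searching for data.

    The 8-hour default is an assumption, not a figure from the interview.
    """
    return hours_searching_per_day / hours_worked_per_day


low = search_time_share(1.0)   # one hour a day
high = search_time_share(2.0)  # two hours a day
print(f"{low:.1%} to {high:.1%} of the working day")
# prints "12.5% to 25.0% of the working day"
```

Under that assumption, an analyst spends between an eighth and a quarter of each day looking for data rather than analyzing it.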
