
TDWI Blog

Philip Russom, Ph.D., is senior director of TDWI Research for data management and is a well-known figure in data warehousing, integration, and quality, having published over 550 research reports, magazine articles, opinion columns, and speeches over a 20-year period. Before joining TDWI in 2005, Russom was an industry analyst covering data management at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and consultant, was a contributing editor with leading IT magazines, and a product manager at database vendors. His Ph.D. is from Yale. You can reach him by email (prussom@tdwi.org), on Twitter (twitter.com/prussom), and on LinkedIn (linkedin.com/in/philiprussom).


Next Generation MDM – Executive Summary

Blog by Philip Russom
Research Director for Data Management, TDWI

[NOTE -- I recently completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 36-page report in a PDF file in early April 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the fifth excerpt, which is the Executive Summary at the beginning of the report.]

EXECUTIVE SUMMARY

Master data management (MDM) is one of the most widely adopted data management disciplines of recent years. That’s because the consensus-driven definitions of business entities and the consistent application of them across an enterprise are critical success factors for important cross-functional business activities, such as business intelligence (BI), complete views of customers, operational excellence, supply chain optimization, regulatory reporting, compliance, mergers and acquisitions, and treating data as an enterprise asset. Due to these compelling business reasons, many organizations have deployed their first or second generation of MDM solutions. The current challenge is to move on to the next generation.

Basic versus advanced MDM functions and architectures draw generational lines that users must now cross.

For example, some MDM programs focus on the customer data domain, and they need to move on to other domains, like products, financials, partners, employees, and locations. MDM for a single application (such as enterprise resource planning [ERP] or BI) is a safe and effective start, but the point of MDM is to share common definitions and reference data across multiple, diverse applications. Most MDM hubs support basic functions for the offline aggregation and standardization of reference data, whereas they should also support advanced functions for identity resolution, two-way data sync, real-time operation, and approval workflows for newly created master data. In parallel to these generational shifts in users’ practices, vendor products are evolving to support advanced MDM functions, multi-domain MDM applications, and collaborative governance environments.

Users invest in MDM to create complete views of business entities and to share data enterprisewide.

According to survey respondents, the top reasons for implementing an MDM solution are to enable complete views of key business entities (customers, products, employees, etc.) and to share data broadly but consistently across an enterprise. Other reasons concern the enhancement of BI, operational excellence, and compliance. Respondents also report that MDM is unlikely to succeed without strong sponsorship and governance, and MDM solutions need to scale up and to cope with data quality (DQ) issues, if they are to succeed over time.

“Customer” is, by far, the entity most often defined via MDM. This prominence makes sense, because conventional wisdom says that any effort to better understand or serve customers has some kind of business return that makes the effort worthwhile. Other common MDM entities are (in survey priority order) products, partners, locations, employees, and financials.

Users continue to mature their MDM solutions by moving to the next generation.

MDM maturity is good, in that 60% of organizations surveyed have already deployed MDM solutions, and over one-third practice multi-data-domain MDM today. On the downside, most MDM solutions today are totally or partially homegrown and/or hand coded. But on the upside, homegrown approaches will drop from 45% today to 5% within three years, while dedicated MDM application or tool usage will jump from 12% today to 47%. To achieve generational change, half of organizations anticipate replacing their current MDM platform(s) within five years.

The usage of most MDM features and functions will grow in MDM’s next generation.

Over the next three years, we can expect the strongest growth among MDM features and functions for real-time, collaboration, data sync, tool use, and multi-structured data. Good growth is also coming with MDM functions for workflow, analytics, federation, repositories, and event processing. Some MDM options will experience limited growth, because they are saturated (services, governance, quality) or outdated (batch processing and homegrown solutions).

This report helps user organizations understand all that MDM now offers, so they can successfully modernize and build up their best practices in master data management. To that end, it catalogs and discusses new user practices and technical functions for MDM, and it uses survey data to predict which MDM functions will grow most versus those that will decline—all to bring readers up to date so they can make informed decisions about the next generation of their MDM solutions.

=================================================
ANNOUNCEMENT
Please attend the TDWI Webinar where I present the findings of my TDWI report Next Generation MDM, on April 10, 2012 Noon ET. Register online for the Webinar.

Posted by Philip Russom, Ph.D. on March 30, 2012


The Top Ten Priorities for Next Generation MDM

Blog by Philip Russom
Research Director for Data Management, TDWI

[NOTE -- I recently completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 40-page report in a PDF file on April 2, 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the fourth excerpt, which is the ending of the report.]

The Top Ten Priorities for Next Generation MDM
The news in this report is a mix of good and bad. Half of the organizations interviewed and surveyed are mired in the early lifecycle stages of their MDM programs, unable to get over certain humps and mature into the next generation. On the flip side, the other half is well into the next generation, which proves it can be done.

To help more organizations safely navigate into next generation master data management, let’s list its top ten priorities, with a few comments on why these need to replace similar early-phase capabilities. Think of these priorities as recommendations, requirements, or rules that can guide user organizations into the next generation.

1. Multi-data-domain MDM. Many organizations apply MDM to the customer data domain alone, and they need to move on to other domains, like products, financials, and locations. Single-data-domain MDM is a barrier to correlating information across multiple domains.

2. Multi-department, multi-application MDM. MDM for a single application (such as ERP, CRM or BI) is a safe and effective start. But the point of MDM is to share data across multiple, diverse applications and the departments that depend on them. It’s important to overcome organizational boundaries if MDM is to move from being a local fix to being an infrastructure for sharing data as an enterprise asset.

3. Bidirectional MDM. Roach Motel MDM is when you extract reference data and aggregate it in a master database from which it never emerges (as with many BI and CRM systems). Unidirectional MDM is fine for profiling reference data. But bidirectional MDM is required to improve or author reference data in a central place, then publish it out to various applications.

4. Real-time MDM. The strongest trend in data management today (and BI/DW, too) is toward real-time operation as a complement to batch. Real-time is critical to verification, identity resolution, and the immediate distribution of new or updated reference data.

5. Consolidating multiple MDM solutions. How can you create a single view of the customer when you have multiple customer-domain MDM solutions? How can you correlate reference data across domains when the domains are treated in separate MDM solutions? For many organizations, next-generation MDM begins with a consolidation of multiple, siloed MDM solutions.

6. Coordination with other disciplines. To achieve next-generation goals, many organizations need to stop practicing MDM in a vacuum. Instead of MDM as merely a technical fix, it should also align with business goals for data. And MDM should be coordinated with related data management disciplines, especially data integration (DI) and data quality (DQ). A program for data governance or stewardship can provide an effective collaborative process for such coordination.

7. Richer modeling. Reference data in the customer domain works fine with flat modeling, involving a simple (but very wide) record. However, other domains make little sense without a richer, hierarchical model, as with a chart of accounts in finance or a bill of material in manufacturing. Metrics and KPIs – so common in BI today – rarely have proper master data in multidimensional models.

8. Beyond enterprise data. Despite the obsession with customer data from which most MDM solutions suffer, almost none of them today incorporate data about customers from Web sites or social media. If you’re truly serious about MDM as an enabler for CRM, next-generation MDM (and CRM, too) must reach into every customer channel. In a related area, users need to start planning their strategy for MDM with big data and advanced analytics.

9. Workflow and process management. Too often, development and collaborative efforts in MDM are mostly ad hoc actions with little or no process. For an MDM program to scale up and grow, it needs workflow functionality that automates the proposal, review, and approval process for newly created or improved reference data. Vendor tools and dedicated applications for MDM now support workflows within the scope of their tools. For a broader scope, some users integrate MDM with business process management tools.

10. MDM solutions built atop vendor tools and platforms. Admittedly, many user organizations find that home-grown and hand-coded MDM solutions provide adequate business value and technical robustness. However, these are usually simple, siloed departmental solutions. User organizations should look into vendor tools and platforms for MDM and other data management disciplines when they need broader data sharing and more advanced functionality, such as real-time operation, two-way sync, identity resolution, event processing, service orientation, and process workflows or other collaborative functions.
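The proposal, review, and approval process described in item 9 can be sketched as a tiny state machine. This is a minimal illustration, not a pattern from any vendor tool; the states, transitions, and field names are all hypothetical:

```python
from enum import Enum

class State(Enum):
    PROPOSED = "proposed"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    REJECTED = "rejected"

# Allowed transitions in a minimal proposal -> review -> approval workflow.
TRANSITIONS = {
    State.PROPOSED: {State.IN_REVIEW},
    State.IN_REVIEW: {State.APPROVED, State.REJECTED},
    State.APPROVED: set(),
    State.REJECTED: {State.PROPOSED},  # a rejected record can be revised and re-proposed
}

class MasterRecordDraft:
    """A proposed master-data record moving through a governance workflow."""
    def __init__(self, entity, attributes):
        self.entity = entity
        self.attributes = attributes
        self.state = State.PROPOSED
        self.history = [State.PROPOSED]

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)

draft = MasterRecordDraft("customer", {"name": "Acme Corp", "country": "US"})
draft.advance(State.IN_REVIEW)
draft.advance(State.APPROVED)
print(draft.state)  # State.APPROVED
```

The point of even this toy version is that a newly proposed master record cannot reach the hub without passing through an explicit review step, which is exactly what ad hoc MDM efforts lack.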

================================

ANNOUNCEMENTS
Although the above version of the top ten list is excerpted from the upcoming TDWI report on MDM, an earlier version of this list was developed in the TDWI blog “Rules for the Next Generation of MDM.”

Be sure to visit www.tdwi.org on April 2 or later, to download your own free copy of the complete TDWI report on Next Generation Master Data Management.

Please attend the TDWI Webinar where I present the findings of my TDWI report Next Generation MDM, on April 10, 2012 Noon ET. Register online for the Webinar.

Posted by Philip Russom, Ph.D. on March 16, 2012


The Three Core Activities of MDM (part 3)

Blog by Philip Russom
Research Director for Data Management, TDWI

I’ve just completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 40-page report in a PDF file on April 2, 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the third in a series of three excerpts. If you haven’t already, you should read the first excerpt and the second excerpt before continuing.

Technical Solutions for MDM
An implementation of MDM can be complex, because reference data needs a lot of attention, as most data sets do. MDM solutions resemble data integration (DI) solutions (and are regularly mistaken for them), in that MDM extracts reference data from source systems, transforms it to normalized models that comply with internal MDM standards, and aggregates it into a master database where both technical and business people can profile it to reveal duplicates and non-compliant records. Depending on the architecture of an MDM solution, this database may also serve as an enterprise repository or system of record for so-called golden records and other persistent reference records. If the MDM solution supports a closed loop, records that are improved in the repository are synchronized back to the source systems from which they came. Reference data may also be output to downstream systems, like data warehouses or marketing campaign systems.
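The extract-normalize-aggregate-profile flow above can be sketched in a few lines. This is a deliberately simplified illustration, with made-up source systems, field names, and a crude match key; real MDM hubs do far more sophisticated standardization and matching:

```python
from collections import defaultdict

# Hypothetical reference records extracted from two source systems.
crm_records = [
    {"id": "C-01", "name": "ACME CORP.", "country": "usa"},
    {"id": "C-02", "name": "Globex, Inc.", "country": "US"},
]
erp_records = [
    {"id": "E-77", "name": "Acme Corporation", "country": "United States"},
]

COUNTRY_NORMALIZATION = {"usa": "US", "us": "US", "united states": "US"}

def normalize(record, source):
    """Transform a source record into the hub's standard model."""
    return {
        "source": source,
        "source_id": record["id"],
        "name": record["name"].rstrip(".").strip(),
        "country": COUNTRY_NORMALIZATION.get(record["country"].lower(),
                                             record["country"]),
    }

# Aggregate normalized records from all sources into the master database.
hub = ([normalize(r, "crm") for r in crm_records] +
       [normalize(r, "erp") for r in erp_records])

# Profile the aggregated data: group by a crude match key to surface
# likely duplicates for review.
groups = defaultdict(list)
for rec in hub:
    key = (rec["name"].lower().split()[0], rec["country"])
    groups[key].append(rec)

duplicates = {k: v for k, v in groups.items() if len(v) > 1}
```

Here the two spellings of "Acme" from CRM and ERP land in the same group once names and country codes are standardized, which is the kind of finding that profiling reference data is meant to surface.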

MDM solutions also resemble data quality (DQ) solutions, in that many data quality functions are applied to reference data. For example, “customer” is the business entity most often represented in reference data. Customer data is notorious for data quality problems that demand remediation, and customer reference data is almost as problematic. We’ve already mentioned deduplication and standardization. Other data quality functions are also applied to customer reference data (and sometimes other entities, too), including verification and data append. Luckily, most tools for MDM (and related disciplines such as data integration and data quality) can automate the detection and correction of anomalies in reference data. Development of this automation often entails the creation and maintenance of numerous “business rules,” which can be applied automatically by the software, once deployed.

================================

ANNOUNCEMENTS
Keep an eye out for another MDM blog, coming March 16. I’ll tweet, so you know when that blog is posted.

Please attend the TDWI Webinar where I will present the findings of my TDWI report Next Generation MDM, on April 10, 2012 Noon ET. Register online for the Webinar.

Posted by Philip Russom, Ph.D. on March 2, 2012


The Three Core Activities of MDM (part 2)

Blog by Philip Russom
Research Director for Data Management, TDWI

I’ve just completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 40-page report in a PDF file on April 2, 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the second in a series of three excerpts. If you haven’t already, you should read the first excerpt before continuing.

Collaborative Processes for MDM
By definition, MDM is a collaborative discipline that requires a lot of communication and coordination among several types of people. This is especially true of entity definitions, because there is rarely one person who knows all the details that would go into a standard definition of a customer or other entity. The situation is compounded when multiple definitions of an entity are required to make reference data “fit for purpose” across multiple IT systems, lines of business, and geographies. For example, sales, customer service, and finance all interact with customers, but have different priorities that should be reflected in a comprehensive entity model. Likewise, technical exigencies of the multiple IT systems sharing data may need addressing in the model. And many entities are complex hierarchies or have dependencies that take several people to sort out, as in a bill of material (for products) or a chart of accounts (for financials).

Once a definition is created from a business viewpoint, further collaboration is needed to gain review and approval before applying the definition to IT systems. At some point, business and technical people come together to decide how best to translate the definition into the technical media through which a definition is expressed. Furthermore, technical people working on disparate systems must collaborate to develop the data standards needed for the exchange and synchronization of reference data across systems. Since applying MDM definitions often requires that changes be made to IT systems, managing those changes demands even more collaboration.

That’s a lot of collaboration! To organize the collaboration, many firms put together an organizational structure where all interested parties can come together and communicate according to a well-defined business process. For this purpose, data governance committees or boards have become popular, although stewardship programs and competency centers may also provide a collaborative process for MDM and other data management disciplines (especially data quality).

================================
ANNOUNCEMENTS
Keep an eye out for part 3 in this MDM blog series, coming March 2. I’ll tweet so you know when that blog is posted.

David Loshin and I will moderate the TDWI Solution Summit on Master Data, Quality, and Governance, coming up March 4-6, 2012 in Savannah, Georgia.

Please attend the TDWI Webinar where I will present the findings of my TDWI report Next Generation MDM, on April 10, 2012 Noon ET. Register online for the Webinar.

Posted by Philip Russom, Ph.D. on February 17, 2012


The Three Core Activities of MDM (part 1)

Blog by Philip Russom
Research Director for Data Management, TDWI

I’ve just completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 40-page report in a PDF file on April 2, 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the first in a series of three excerpts.

Defining Master Data Management
To get us all on the same page, let’s start with a basic definition of MDM, then drill into details:

Master data management (MDM) is the practice of defining and maintaining consistent definitions of business entities (e.g., customer or product) and data about them across multiple IT systems and possibly beyond the enterprise to partnering businesses. MDM gets its name from the master and/or reference data through which consensus-driven entity definitions are usually expressed. An MDM solution provides shared and governed access to the uniquely identified entities of master data assets, so those enterprise assets can be applied broadly and consistently across an organization.

That’s a good nutshell definition of what MDM is. However, to explain in detail what MDM does, we need to look at the three core activities of MDM, namely: business goals, collaborative processes, and technical solutions.

Business Goals and MDM
Most organizations have business goals, such as retaining and growing customer accounts, optimizing a supply chain, managing employees, tracking finances accurately, or building and supporting quality products. All these and other data-driven goals are more easily and accurately achieved when supported by master data management. That’s because most business goals focus on a business entity, such as a customer, supplier, employee, financial instrument, or product. Some goals combine two or more entities, as in customer profitability (customers, products, and finances) or product quality (suppliers and products). MDM contributes to these goals by providing processes and solutions for assembling complete, clean, and consistent definitions of these entities and reference data about them. Many business goals span multiple departments, and MDM prepares data about business entities so it can be shared liberally across an enterprise.

Sometimes the business goal is to avoid business problems. As a case in point, consider that one of the most pragmatic applications of MDM is to prevent multiple computer records for a single business entity. For example, multiple departments of a corporation may each have a customer record for the same customer. Similarly, two merging firms end up with multiple records when they have customers in common.

Business problems ensue from redundant customer records. If the records are never synchronized or consolidated, the firm will never understand the complete relationship it has with that customer. Undesirable business outcomes include double billing and unwarranted sales attempts. From the view of a single department, the customer’s commitment seems less than it really is, resulting in inappropriately low discounts or service levels. MDM alleviates these problems by providing collaborative processes and technical solutions that link equivalent records in multiple IT systems, so the redundant records can be synchronized or consolidated. Deduplicating redundant records is a specific use case within a broader business goal of MDM, namely to provide complete and consistent data (especially views of specific business entities) across multiple departments of a larger enterprise, thereby enabling or improving cross-functional business processes.
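The linking and consolidation just described can be sketched with a deterministic match key and naive survivorship. Everything here is a simplification for illustration; production MDM tools use probabilistic and fuzzy matching plus configurable survivorship rules:

```python
def normalize_name(name):
    """Reduce a name to lowercase alphanumerics so trivial variants compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def match_key(record):
    # Crude deterministic key: normalized name plus postcode.
    return (normalize_name(record["name"]), record["postcode"])

# The same customer as seen by two departments' systems (hypothetical fields).
billing = {"name": "Acme Corp.", "postcode": "10001", "balance": 1200}
sales   = {"name": "ACME CORP",  "postcode": "10001", "open_opportunities": 3}

def consolidate(records):
    """Merge equivalent records into one golden record per match key."""
    golden = {}
    for rec in records:
        merged = golden.setdefault(match_key(rec), {})
        merged.update(rec)   # naive survivorship: the last value written wins
    return golden

view = consolidate([billing, sales])
```

Because both spellings normalize to the same key, the two departmental records collapse into a single view that carries both the balance and the open opportunities, which is the "complete relationship" that redundant, unlinked records obscure.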

================================

ANNOUNCEMENTS
Keep an eye out for part 2 and part 3 in this MDM blog series, coming February 17 and March 2, respectively. I’ll tweet so you know when each blog is posted.

David Loshin and I will moderate the TDWI Solution Summit on Master Data, Quality, and Governance, coming up March 4-6, 2012 in Savannah, Georgia. You should attend!


Please attend the TDWI Webinar where I present the findings of my TDWI report Next Generation MDM, on April 10, 2012 Noon ET. Register online for the Webinar.

Posted by Philip Russom, Ph.D. on February 3, 2012


Big Data Analytics: 2012 New Year's Predictions

By Philip Russom

Before January runs out, I thought I should tender a few prognostications for 2012. Sorry to be so late with this, but I have a demanding day job. Without further ado, here are a few trends, practices, and changes I feel we can expect in 2012.

Big data will get bigger. But, then, you knew that. Enough said.

The connection between big data and advanced analytics will get even stronger. My base assumption is that advanced analytics has become such an important priority for user organizations that it’s influencing most of what we do in business intelligence (BI), data warehousing (DW), and data management (DM). It even influences our attitudes toward big data. After all, the current frenzy – which will become more operationalized than ad hoc in 2012 – is to apply advanced analytic techniques to big data. In other words, don’t do one without the other, if you’re a BI professional.

From problem to opportunity. The survey for my recent TDWI report on Big Data Analytics shows that 70% of organizations already think of big data as an asset to be leveraged, largely through advanced analytics. In 2012, the other 30% will come around.

From hoarding to collecting. As a devotee of irony, I’m amused to see reality TV shows about collectibles and hoarding run back-to-back. Practices lauded in the former are abhorred in the latter, yet the line between collecting and hoarding is a thin one. Big data is a case in point. Many organizations have hoarded Web logs, RFID streams, and other big data sets for years. The same organizations are now turning the corner into collecting these with a dedicated purpose, namely analytics.

Advanced analytics will become as commonplace as OLAP. Okay, I admit that I’m exaggerating for dramatic effect. But, I have to say that big data alone has driven many organizations beyond OLAP into advanced forms of analytics, namely those based on mining, statistics, complex SQL, and natural language processing. This trend has been running for almost five years; there may be another five in it.

God is in the details. Or is the devil in the details? I guess it depends on what we’re talking about. With big data analytics, expect to see far more granular detail than ever before. For example, most 360-degree customer views today include hundreds of customer attributes. Big data can bump that up to thousands of attributes, which in turn provides greater detail and precision for customer-base segmentation and other customer analytics, both old and new.

Multi-structured data. Are you as sick of the “structured data versus unstructured data” comparison as I am? This tired construct doesn’t really work with big data, because it’s often a mix of structured, semi-structured, and unstructured data, plus gradations among these. I like the term “multi-structured data” (which I admit that I picked up from Teradata folks) because the term covers the whole range and it reminds us that big data is often a kind of mashup. To get full business value out of big data through analytics, more user organizations will invest in people skills and tools that span the full range of multi-structured data.

You will change your data warehouse architecture. At least, you will if you’re truly satisfying the requirements of big data analytics. Let’s be honest. Most EDWs are designed and optimized by their technical users for reporting, performance management, OLAP, and not much else. This is both a user design issue and a vendor platform issue. In recent years, I’ve seen tons of organizations rearchitect their EDWs (and sometimes swap platforms) to accommodate massive big data, multi-structured data, real-time big streams, and the demanding workloads of advanced analytics. This painful-but-necessary trend is far from over.

I’m stopping here because I’ve reached my target word count. And my growling stomach says it’s lunch time. But you get the idea. The business value of advanced analytics and the nuggets to be mined from big data have driven a lot of change recently, and will continue to do so throughout 2012.

SUGGESTED READING:
For a detailed discussion, see the TDWI Best Practices Report, titled Big Data Analytics, which is available in a PDF file via a free download.

You can also replay my TDWI Webinar, where I present the findings of the Big Data Analytics report.

For a discussion of similar issues, download the TDWI Checklist Report, titled Hadoop: Revealing Its True Value for Business Intelligence.

And you can replay last month’s TDWI Webinar, in which I led a panel of vendor representatives in a discussion of Hadoop and related technologies.

Philip Russom is the research director for data management at The Data Warehousing Institute (TDWI). You can reach him at prussom@tdwi.org or follow him as @prussom on Twitter.

Posted by Philip Russom, Ph.D. on January 23, 2012