By Philip Russom
Research Director for Data Management, TDWI
To help you better understand new practices for managing big data and why you should care, I’d like to share with you the series of 30 tweets I recently issued on the topic. I think you’ll find the tweets interesting, because they provide an overview of big data management and its best practices in a form that’s compact, yet amazingly comprehensive.
Every tweet I wrote was a short sound bite or stat bite drawn from my recent TDWI report “Managing Big Data.” Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.
I left in the arcane acronyms, abbreviations, and incomplete sentences typical of tweets, because I think that all of you already know them or can figure them out. Even so, I deleted a few tiny URLs, hashtags, and repetitive phrases. I issued the tweets in groups, on related topics, so I’ve added some headings to this blog to show that organization. Otherwise, these are raw tweets.
Types of Multi-Structured Data Managed as Big Data
1. #TDWI SURVEY SEZ: 26% of users manage #BigData that’s ONLY structured, usually relational.
2. #TDWI SURVEY SEZ: 31% manage #BigData that’s eclectic mix of struc, unstruc, semi, etc.
3. #TDWI SURVEY SEZ: 38% don’t have #BigData by any definition. Hear more in #TDWI Webinar Oct.8 noonET http://bit.ly/BDMweb
4. Structured (relational) data from traditional apps is most common form of #BigData.
5. #BigData can be industry specific, like unstruc’d text in insurance, healthcare & gov.
6. Machine data is special area of #BigData, with as yet untapped biz value & opportunity.
Reasons for Managing Big Data Well
7. Why manage #BigData? Keep pace w/growth, biz ROI, extend ent data arch, new apps.
8. Want to get biz value from #BigData? Manage #BigData for purposes of advanced #analytics.
9. #BigDataMgt yields larger samples for apps that need it: 360° views, risk, fraud, customer seg.
10. #TDWI SURVEY SEZ: 89% feel #BigDataMgt is opportunity. Mere 11% think it’s a problem.
11. Key benefits of #BigDataMgt are better #analytics, datasets, biz value, sales/marketing.
12. Barriers to #BigDataMgt: low maturity, weak biz support, new design paradigms.
13. #BigDataMgt non-issues: bulk load, query speed, scalability, network bandwidth.
Strategies for Users’ Big Data Management Solutions
14. #TDWI SURVEY SEZ: 10% have #BigDataMgt solution in production; 10% in dev; 20% prototype; 60% nada. #TDWI Webinar Oct.8 http://bit.ly/BDMweb
15. #TDWI SURVEY SEZ: Most common strategy for #BigDataMgt: extend existing DataMgt systems.
16. #TDWI SURVEY SEZ: 2nd most common strategy for #BigDataMgt: deploy new DataMgt systems for #BigData.
17. #TDWI SURVEY SEZ: 30% have no strategy for #BigDataMgt though they need one.
18. #TDWI SURVEY SEZ: 15% have no strategy for #BigDataMgt cuz they don’t need one.
Ownership and Use of Big Data Management Solutions
19. Some depts. & groups have own #BigDataMgt platforms, including #Hadoop. Beware teramart silos!
20. Trend: #BigDataMgt platforms supplied by IT as infrastructure. Imagine shared #Hadoop cluster.
21. Who does #BigDataMgt? analysts 22%; architects 21%; mgrs 21%; tech admin 13%; app dev 11%.
Tech Specs for Big Data Management Solutions
22. #TDWI SURVEY SEZ: 97% of orgs manage structured #BigData, followed by legacy, semi-struc, Web data etc.
23. Most #BigData stored on trad drives, but solid state drives & in-memory functions are gaining.
24. #TDWI SURVEY SEZ: 10-to-99 terabytes is the norm for #BigData today.
25. #TDWI SURVEY SEZ: 10% have broken the 1 petabyte #BigData barrier. Another 13% will within 3 years.
A Few Best Practices for Managing Big Data
26. For open-ended discovery-oriented #analytics, manage #BigData in original form wo/transformation.
27. Reporting and #analytics are different practices; managing #BigData for each is, too.
28. #BigData needs data standards, but different ones compared to other enterprise data.
29. Streaming #BigData is easy to capture & manage offline, but tough to process in #RealTime.
30. Non-SQL, non-relational platforms are coming on strong; BI/DW needs them for diverse #BigData.
Want to learn more about managing big data?
For a much more detailed discussion—in a traditional publication!—get the TDWI Best Practices Report, titled Managing Big Data, available in a PDF file via a free download.
You can also register for and replay my TDWI Webinar, where I present the findings of Managing Big Data.
Please consider taking courses at the TDWI World Conference in Boston, October 20–25, 2013. Enroll online.
============================
Philip Russom is the research director for data management at TDWI. You can reach him at
[email protected] or follow him as @prussom on Twitter.
Posted by Philip Russom, Ph.D. on October 11, 2013
Good information and analytics are vital to enabling organizations of all stripes to survive tumultuous changes in the healthcare landscape. The latest issue of TDWI’s What Works in Healthcare focuses on data-driven transformations in healthcare. I wrote an article for the issue that looks at some of the business intelligence and analytics issues surrounding the transition from a traditional, fee-for-service system to a value-based, “continuum of care” approach. One thing is clear: The importance of data and information integration as the fabric of this approach cannot be overstated.
A continuum (or “continuity”) of care is where a patient’s care experiences are connected across multiple providers: doctors, therapists, clinics, hospitals, pharmacies, and so on, including social programs. The traditional, fee-based approach has encouraged a disconnected experience for patients; visits to providers are mutually exclusive events and their patient data also lives in disparate silos. This disconnect increases the risk of patients getting the wrong treatments, taking medications improperly due to poor follow-up, or falling through the cracks entirely until there is an emergency. When patients only engage with healthcare when there is an emergency, costs go up. If there is poor follow-up after a hospital or emergency care visit, there is a greater likelihood that patients will have to be readmitted soon for the same problem.
Information integration plays a key role in the business model convergence that many experts envision as essential to improving care. “We see new partnerships or communities of care forming to improve collaboration across boundaries,” said Karen Parrish, IBM VP of Industry Solutions for the Public Sector, during a recent conversation about IBM’s Smarter Care. IBM’s ambitious program, announced in May, “enables new business and financial models that encourage interaction among government and social programs, healthcare practitioners and facilities, insurers, employers, life sciences companies and citizens themselves,” according to the company. Improving the continuum of a particular patient’s care among these participants will require good quality data and fewer barriers to the flow of information so that the right caregivers are involved, depending on the circumstances.
At the center of this information flow must be the patient. “Access to the unprecedented amount of data available today creates an opportunity for deeper insight and earlier intervention and engagement with the patient,” said Parrish. This includes unstructured data, such as doctors’ notes. In an insightful interview with TDWI’s Linda Briggs, Ted Corbett, founder of Vizual Outcomes (and a speaker at the upcoming TDWI BI Executive Summit in San Diego), points out that while unstructured data “houses some of the richest data in the hospital system…there is little consistency across providers in note format, which makes it difficult to access this rich store of information.”
To improve the speed and quality of unstructured data analysis, IBM puts forth its cognitive computing engine Watson, which understands natural language. While Watson and cognitive computing are topics for another day, it’s clear that when we talk about information integration in healthcare, we have to remember that the vast majority of this information is unstructured. There will be increasing demand to apply machine learning and other computing power to draw intelligence from an integrated view of multiple sources of this information to improve patient care and treatment.
Posted by David Stodder on July 17, 2013
By Philip Russom, TDWI Research Director for Data Management
This week, we at TDWI produced our fifth annual Solution Summit on Master Data, Quality, and Governance, this time on Coronado Island off San Diego. I moderated the conference, along with David Loshin (president of Knowledge Integrity). We lined up a host of great user speakers and vendor panelists. The audience asked dozens of insightful questions, and the event included hundreds of one-to-one meetings among attendees, speakers, and vendor sponsors. The aggregated result was a massive knowledge transfer that illustrates how master data management (MDM), data quality (DQ), and data governance (DG) are more vibrant than ever.
To give you a sense of the range of data management topics addressed at the summit, here’s an overview of some of the presentations heard at the TDWI Solution Summit for Master Data, Quality, and Governance:
Luuk van den Berg (Data Governance Lead, Cisco Systems) talked about MDM and DG tools and processes that are unique to Cisco, especially their method of “watermarking reports,” which enables them to certify the quality, source, and content of data, as well as reports that contain certified data.
Joe Royer (Enterprise Architect, Principal Financial Group) discussed how Principal found successful starting points for their DG program. Joe also discussed strategies for sustaining DG, different DG models, and how DG can accelerate MDM and DQ programs.
Rich Murnane (Director of Enterprise Data Operations, iJET International) described how iJET collects billions of data points annually to provide operational risk management solutions to their clients. To that end, iJET applies numerous best practices in MDM, DQ, DG, stewardship, and data integration.
Hope Johansen (Master Data Project Manager, Schlumberger). Because of mergers, Schlumberger ended up with many far-flung facilities, plus related physical assets and locations. Hope explained how her team applied MDM techniques to data about facilities (not the usual customer and product data domains!) to uniquely identify facilities and similar assets, so they could be optimized by the business.
Bruce Harris (Vice President, Chief Risk and Strategy Officer, Volkswagen Credit). VW Credit has the long-term goal of becoming a “Strategy-Focused Organization.” Bruce explained how DG supports that business goal. As an experienced management executive who has long depended on data for operational excellence, Bruce recounted many actionable tips for aligning data management work with stated business goals.
Kenny Sargent (Technical Lead, Enterprise Data Products, Compassion International) described his method for “Validation-Driven Development.” In this iterative agile method, a developer gets an early prototype in front of business users ASAP to get feedback and to make course corrections as early as possible. The focus is on data, to validate early on that the right data is being used correctly and that the data meets business requirements.
Besides the excellent user speakers described above, the summit also had breakout sessions with speeches by representatives from vendor firms that sponsored the summit. Sponsoring firms included: IBM, SAS, BackOffice Associates, Collibra, Compact Solutions, DataMentors, Information Builders, and WhereScape.
To learn more about the event, visit the Web site for the 2013 TDWI Solution Summit on Master Data, Quality and Governance.
Posted by Philip Russom, Ph.D. on June 5, 2013
We are just weeks away from the TDWI World Conference in Chicago (May 5–10), where the theme will be “Big Data Tipping Point.” I have it on good authority that by then, the current coldness will have passed and Chicago will be basking in beautiful spring weather. (If not, as they say, wait five minutes.) True to its theme, the conference will feature many educational sessions to help you get beyond the big data hype and learn how to apply best practices and new technologies for conquering the challenges posed by rising data volumes and increased data variety.
I would like to highlight three sessions to be held at the conference that I see as important to this objective. The first actually does not have “big data” in its description but addresses what always appears in our research as a topmost concern: data integration. In many organizations, the biggest “big data” challenge is not so much about dealing with one large source as integrating many sources and performing analytics across them. Mark Peco will be teaching “TDWI Data Integration Principles and Practices: Creating Information Unity from Data Disparity” on Monday, May 6.
On Wednesday, May 7, Dave Wells will head up “TDWI Business Analytics: Exploration, Experimentation, and Discovery.” For most organizations, the central focus of big data thinking is about analytics; business leaders want to anchor decisions in sound data analysis and use data science practices to uncover new insights in trends, patterns, and correlations. Yet understanding analytics techniques and how to align them with business demands remains a barrier. Dave Wells does a great job of explaining analytics, how the practices relate to business intelligence, and how to bring the practices to bear to solve business problems.
The third session I’d like to spotlight is “Building a Business Case for Big Data in Your Data Warehouse,” taught by Krish Krishnan. Building the business case is a critical starting point for big data projects and for determining their relationship to the existing data warehouse. Krish is great at helping professionals get the big picture and then see where to begin, so that you don’t get intimidated by the scale. He will cover building the business case, the role of data scientists, and how next-generation business intelligence fits into the big data picture.
These are just three of the many sessions to be held during the week, on topics ranging from data mining, Hadoop, and social analytics to advanced data modeling and data virtualization. I hope you can attend the Chicago TDWI World Conference!
Posted by David Stodder on April 10, 2013
Happy New Year to the TDWI Community! As we head into 2013, it’s clear that organizations will continue to face unpredictable economic currents and regulatory pressures, and will require better intelligence and faster decision processes. TDWI has just published a new Best Practices Report that I wrote, “Achieving Greater Agility with Business Intelligence.” This report focuses on how organizations can develop and deploy BI, analytics, and data warehousing to improve flexibility and decision-making speed. I hope you can attend our upcoming Webinar presentation of the report, to be held on January 15, which will take an in-depth look at the research findings and offer best practices recommendations for increasing agility.
Three key areas of innovation in technologies and practices that I covered in the report will clearly be important as organizations aim for higher agility in 2013. These include the following:
Managed, self-service BI and analytic data discovery of structured and unstructured data: Decision makers are demanding tools that will allow them to access, analyze, profile, cleanse, transform, and share information without having to wait for IT. They will need access to more than just historical, structured data found in traditional systems. Unified access to both structured and unstructured data is growing in importance as decision makers seek to perform complete, context-rich analysis against big data.
New data warehousing and integration options, including virtualization: Data integration can be the source of challenging and expensive problems. Organizations are evaluating the range of options, including data federation and virtualization, that can give users managed self-service. These could allow users to work more iteratively with IT to create comprehensive views of data in place without having to physically extract and move it into an application, data mart, or specialized data store. (A minimal sketch of the idea follows this list.)
Agile development methods: The use of agile methods, now a mainstream trend in software development, is having an increasing impact on BI and data warehousing. Organizations are proving that, by implementing Scrum and other techniques, they can remove a good deal of the wait and waste of traditional development processes.
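To make the virtualization option sketched above concrete, here is a minimal example that uses SQLite’s ATTACH as a stand-in for a federation layer; all file, table, and view names are hypothetical, and a real virtualization product layers optimization, security, and governance on top of this basic idea.

```python
import sqlite3

# Create two physically separate stores so the sketch runs end to end;
# in practice these databases already exist. All names are hypothetical.
with sqlite3.connect("warehouse.db") as db:
    db.execute("CREATE TABLE IF NOT EXISTS orders "
               "(customer_id INTEGER, order_date TEXT, amount REAL)")
with sqlite3.connect("crm.db") as db:
    db.execute("CREATE TABLE IF NOT EXISTS customers "
               "(customer_id INTEGER, segment TEXT)")

conn = sqlite3.connect("warehouse.db")
conn.execute("ATTACH DATABASE 'crm.db' AS crm")

# The "virtual" part: a temporary view that joins data across both
# stores in place, with no rows extracted into a new data mart.
# (SQLite requires TEMP views for cross-database references.)
conn.execute("""
    CREATE TEMP VIEW customer_orders AS
    SELECT c.customer_id, c.segment, o.order_date, o.amount
    FROM crm.customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
""")
print(conn.execute("SELECT COUNT(*) FROM customer_orders").fetchone())
```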
In the report, we found that most organizations regard their agility – that is, their ability to adjust to change and take advantage of emerging opportunities – as merely “average.” No doubt, organizations seeking new competitive advantages in 2013 will demand better than that. They will be looking to their BI, analytics, and data warehousing systems to help them reach a higher level of agility.
Posted by David Stodder on January 7, 2013
Personal, self-service analytics and discovery is one of the most important trends not only in business intelligence (BI) but in user applications generally. Expensive, monster systems that have big footprints and are not flexible enough to meet dynamic business needs are increasingly viewed by users as legacy. Rather than work with monolithic, one-size-fits-all applications that are dominated by IT management and development, users today want freedom and agility. They do not want to wait weeks or months for changes; they want to tailor reporting, analysis, and data sharing to their immediate and often changing needs.
I recently wrote a TDWI Checklist Report on this topic. The report offers seven steps toward personal, self-service BI and analytics success, from taking new approaches to gathering user requirements to implementing in-memory computing, visualization, and enterprise integration. I hope you find this Checklist Report useful in your BI and analytics technology evaluations and deployments.
An important conclusion in the report is that perhaps ironically, IT data management is absolutely critical to the success of personal, self-service analytics and discovery. Nowhere is this truer than with enterprise data integration. Business users often require a mix of different types of data, including structured, detailed data, aggregate or dimensional data, and semi-structured or unstructured content. In addition, given that it is doubtful that users will give up their spreadsheets any time soon, systems must be able to import and export data and analysis artifacts to and from spreadsheets. Assembling and orchestrating access to such diverse data sources must not be left up to nontechnical users.
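As a small illustration of that spreadsheet round trip, here is a hedged sketch in Python; it assumes pandas plus an Excel engine such as openpyxl is installed, and every file and column name is made up for the example.

```python
import pandas as pd

# Simulate an analyst-maintained spreadsheet (hypothetical names).
targets_df = pd.DataFrame({"customer_id": [1, 2], "target": [100.0, 90.0]})
targets_df.to_excel("analyst_targets.xlsx", index=False)

# The IT-managed side: a governed warehouse extract.
warehouse = pd.DataFrame({"customer_id": [1, 2, 3],
                          "revenue": [120.0, 75.5, 210.0]})

# Import the spreadsheet, join it on governed keys, and export the
# result back to a spreadsheet, so the orchestration of diverse
# sources lives with IT rather than with each business user.
targets = pd.read_excel("analyst_targets.xlsx")
combined = warehouse.merge(targets, on="customer_id", how="left")
combined.to_excel("revenue_vs_target.xlsx", index=False)
print(combined)
```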
Thus, even as users celebrate the trend toward personal, self-service analytics and discovery, its success hinges on IT’s data management prowess to ensure data quality, enterprise integration, security, availability, and ultimately, business agility with information.
Posted by David Stodder on June 28, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
To raise awareness of what the Next Generation of Master Data Management (MDM) is all about, I recently issued a series of 35 tweets via Twitter, over a two-week period. The tweets also helped promote a TDWI Webinar on Next Generation MDM. Most of these tweets triggered responses to me or retweets. So I seem to have reached the business intelligence (BI), data warehouse (DW), and data management (DM) audience I was looking for – or at least touched a nerve!
To help you better understand Next Generation MDM and why you should care about it, I’d like to share these tweets with you. I think you’ll find them interesting because they provide an overview of Next Generation MDM in a form that’s compact, yet amazingly comprehensive.
Every tweet I wrote was a short sound bite or stat bite drawn from TDWI’s recent report on Next Generation MDM, which I researched and wrote. Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.
I left in the arcane acronyms, abbreviations, and incomplete sentences typical of tweets, because I think that all of you already know them or can figure them out. Even so, I deleted a few tiny URLs, hashtags, and repetitive phrases. I issued the tweets in groups, on related topics, so I’ve added some headings to this blog to show that organization. Otherwise, these are raw tweets.
Defining the Generations of MDM
1. #MDM is inherently a multigenerational discipline w/many life cycle stages. Learn its generations in #TDWI Webinar
2. User maturation, new biz reqs, & vendor advances drive #MDM programs into next generation. Learn more in #TDWI Webinar
3. Most #MDM generations incrementally add more data domains, dep’ts, data mgt tools, operational apps.
4. More dramatic #MDM generations consolidate redundant solutions, redesign architecture, replace platform.
Why Care About NG MDM?
5. Why care about NexGen #MDM? Because biz needs consistent data for sharing, BI, compliance, 360views.
6. Why care about NexGen #MDM? Most orgs have 1st-gen, homegrown solutions needing update or replacement.
The State of NG MDM
7. #TDWI SURVEY SEZ: #MDM adoption is good. 61% of surveyed orgs have deployed solution. Another 29% plan to soon.
8. #TDWI SURVEY SEZ: #MDM integration is not so good. 44% of solutions deployed are silos per dept, app, domain.
9. #TDWI SURVEY SEZ: Top, primary reasons for #MDM: 360-degree views (21%) & sharing data across enterprise (19%).
10. #TDWI SURVEY SEZ: Top, secondary reasons for #MDM: Data-based decisions (15%) & customer intelligence (13%).
11. #TDWI SURVEY SEZ: Other reasons for #MDM: operational excellence, reduce cost, audits, compliance, reduce risk.
12. #TDWI SURVEY SEZ: Top, primary #MDM challenges are lack of: exec sponsor, data gov, cross-function collab, biz driver.
13. #TDWI SURVEY SEZ: Other challenges to #MDM: growing reference data, coord w/other data mgt teams, poor data quality.
MDM’s Business Entities and Data Domains
14. #TDWI SURVEY SEZ: “Customer” is biz entity most often defined via #MDM (77%). But I bet you knew that already!
15. #TDWI SURVEY SEZ: Other #MDM entities (in survey order) are product, partner, location, employee, financial.
16. #TDWI SURVEY SEZ: Surveyed organizations have an average of 5 definitions for customer and 5 for product. #MDM
17. #TDWI TAKE: Multi-data-domain support is a key metric for #MDM maturity. Single-data-domain is a myopic silo.
18. #TDWI SURVEY SEZ: 37% practice multi-data-domain #MDM today, proving it can succeed in a wide range of orgs.
19. #TDWI SURVEY SEZ: Multi-data-domain maturity is good. Only 24% rely mostly on single-data-domain #MDM.
20. #TDWI SURVEY SEZ: A third of survey respondents (35%) have a mix of single- and multi-domain #MDM solutions.
Best Practices of Next Generation MDM
21. #TDWI TAKE: Unidirectional #MDM improves reference data but won’t share. Not a hub unless ref data flows in/out
22. #TDWI SURVEY SEZ: #MDM solutions today r totally (26%) or partially (19%) homegrown. Learn more in Webinar http://bit.ly/NG-MDM #GartnerMDM
23. #TDWI SURVEY SEZ: Users would prefer #MDM functions from suite of data mgt tools (32%) or dedicated #MDM app/tool (47%)
24. #TDWI Survey: 46% claim to be using biz process mgt (BPM) now w/#MDM solutions. 32% said integrating MDM w/BPM was challenging.
25. #TDWI SURVEY SEZ: Half of surveyed organizations (46%) have no plans to replace #MDM platform.
26. #TDWI SURVEY SEZ: Other half (50%) is planning a replacement to achieve generational change. Learn more in Webinar http://bit.ly/NG-MDM
27. Why rip/replace #MDM? For more/better tools, functions, arch, gov, domains, enterprise scope.
28. Need #MDM for #Analytics? Depends on #Analytics type. OLAP, complex SQL: Oh, yes. Data/text mining, NoSQL, NLP: No way.
Quantifying the Generational Change of MDM Features
29. #TDWI SURVEY SEZ: Expect hi growth (27% to 36%) in #MDM options for real-time, collab, ref data sync, tool use.
30. #TDWI SURVEY SEZ: Good growth (5% to 22%) coming for #MDM workflow, analytics, federation, repos, event proc.
31. #TDWI SURVEY SEZ: Some #MDM options will be flat due to saturation (gov, quality) or outdated (batch, homegrown).
Top 10 Priorities for Next Generation MDM
32. Top 10 Priorities for NG #MDM (Pt.1) 1-Multi-data-domain. 2-Multi-dept/app. 3-Bidirectional. 4-Real-time. #TDWI
33. Top 10 Priorities for NG #MDM (Pt.2) 5-Consolidate multi MDM solutions. 6-Coord w/other disciplines. #TDWI
34. Top 10 Priorities for NG #MDM (Pt.3) 7-Richer modeling. 8-Beyond enterprise data. 9-Workflow/process mgt. #TDWI
35. Top 10 Priorities for NG #MDM (Pt.4) 10-Retire early gen homegrown & build NexGen on vendor tool/app. #TDWI
FOR FURTHER STUDY:
For a more detailed discussion of Next Generation MDM – in a traditional publication! – see the TDWI Best Practices Report, titled “Next Generation Master Data Management,” which is available in a PDF file via a free download.
You can also register for and replay my TDWI Webinar, where I present the findings of the Next Generation MDM report.
Philip Russom is the research director for data management at TDWI. You can reach him at [email protected] or follow him as @prussom on Twitter.
Posted by Philip Russom, Ph.D. on April 13, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
[NOTE -- I recently completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 36-page report in a PDF file in early April 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the fifth excerpt, which is the Executive Summary at the beginning of the report.]
EXECUTIVE SUMMARY
Master data management (MDM) is one of the most widely adopted data management disciplines of recent years. That’s because the consensus-driven definitions of business entities and the consistent application of them across an enterprise are critical success factors for important cross-functional business activities, such as business intelligence (BI), complete views of customers, operational excellence, supply chain optimization, regulatory reporting, compliance, mergers and acquisitions, and treating data as an enterprise asset. Due to these compelling business reasons, many organizations have deployed their first or second generation of MDM solutions. The current challenge is to move on to the next generation.
Basic versus advanced MDM functions and architectures draw generational lines that users must now cross.
For example, some MDM programs focus on the customer data domain, and they need to move on to other domains, like products, financials, partners, employees, and locations. MDM for a single application (such as enterprise resource planning [ERP] or BI) is a safe and effective start, but the point of MDM is to share common definitions and reference data across multiple, diverse applications. Most MDM hubs support basic functions for the offline aggregation and standardization of reference data, whereas they should also support advanced functions for identity resolution, two-way data sync, real-time operation, and approval workflows for newly created master data. In parallel to these generational shifts in users’ practices, vendor products are evolving to support advanced MDM functions, multi-domain MDM applications, and collaborative governance environments.
Users invest in MDM to create complete views of business entities and to share data enterprisewide.
According to survey respondents, the top reasons for implementing an MDM solution are to enable complete views of key business entities (customers, products, employees, etc.) and to share data broadly but consistently across an enterprise. Other reasons concern the enhancement of BI, operational excellence, and compliance. Respondents also report that MDM is unlikely to succeed without strong sponsorship and governance, and MDM solutions need to scale up and to cope with data quality (DQ) issues, if they are to succeed over time.
“Customer” is, by far, the entity most often defined via MDM. This prominence makes sense, because conventional wisdom says that any effort to better understand or serve customers has some kind of business return that makes the effort worthwhile. Other common MDM entities are (in survey priority order) products, partners, locations, employees, and financials.
Users continue to mature their MDM solutions by moving to the next generation.
MDM maturity is good, in that 60% of organizations surveyed have already deployed MDM solutions, and over one-third practice multi-data-domain MDM today. On the downside, most MDM solutions today are totally or partially homegrown and/or hand coded. But on the upside, homegrown approaches will drop from 45% today to 5% within three years, while dedicated MDM application or tool usage will jump from 12% today to 47%. To achieve generational change, half of organizations anticipate replacing their current MDM platform(s) within five years.
The usage of most MDM features and functions will grow in MDM’s next generation. Over the next three years, we can expect the strongest growth among MDM features and functions for real-time, collaboration, data sync, tool use, and multistructured data. Good growth is also coming with MDM functions for workflow, analytics, federation, repositories, and event processing. Some MDM options will experience limited growth, because they are saturated (services, governance, quality) or outdated (batch processing and homegrown solutions).
This report helps user organizations understand all that MDM now offers, so they can successfully modernize and build up their best practices in master data management. To that end, it catalogs and discusses new user practices and technical functions for MDM, and it uses survey data to predict which MDM functions will grow most versus those that will decline—all to bring readers up to date so they can make informed decisions about the next generation of their MDM solutions.
=================================================
ANNOUNCEMENT
Please attend the TDWI Webinar where I present the findings of my TDWI report Next Generation MDM, on April 10, 2012, at noon ET.
Register online for the Webinar.
Posted by Philip Russom, Ph.D. on March 30, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
[NOTE -- I recently completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 40-page report in a PDF file on April 2, 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the fourth excerpt, which is the ending of the report.]
The Top Ten Priorities for Next Generation MDM
The news in this report is a mix of good and bad. Half of the organizations interviewed and surveyed are mired in the early lifecycle stages of their MDM programs, unable to get over certain humps and mature into the next generation. On the flip side, the other half is well into the next generation, which proves it can be done.
To help more organizations safely navigate into next-generation master data management, let’s list its top ten priorities, with a few comments on why these need to replace similar early-phase capabilities. Think of these priorities as recommendations, requirements, or rules that can guide user organizations into the next generation.
1. Multi-data-domain MDM. Many organizations apply MDM to the customer data domain alone, and they need to move on to other domains, like products, financials, and locations. Single-data-domain MDM is a barrier to correlating information across multiple domains.
2. Multi-department, multi-application MDM. MDM for a single application (such as ERP, CRM or BI) is a safe and effective start. But the point of MDM is to share data across multiple, diverse applications and the departments that depend on them. It’s important to overcome organizational boundaries if MDM is to move from being a local fix to being an infrastructure for sharing data as an enterprise asset.
3. Bidirectional MDM. Roach Motel MDM is when you extract reference data and aggregate it in a master database from which it never emerges (as with many BI and CRM systems). Unidirectional MDM is fine for profiling reference data. But bidirectional MDM is required to improve or author reference data in a central place, then publish it out to various applications.
4. Real-time MDM. The strongest trend in data management today (and BI/DW, too) is toward real-time operation as a complement to batch. Real-time is critical to verification, identity resolution, and the immediate distribution of new or updated reference data.
5. Consolidating multiple MDM solutions. How can you create a single view of the customer when you have multiple customer-domain MDM solutions? How can you correlate reference data across domains when the domains are treated in separate MDM solutions? For many organizations, next-generation MDM begins with a consolidation of multiple, siloed MDM solutions.
6. Coordination with other disciplines. To achieve next-generation goals, many organizations need to stop practicing MDM in a vacuum. Instead of MDM as merely a technical fix, it should also align with business goals for data. And MDM should be coordinated with related data management disciplines, especially DI and DQ. A program for data governance or stewardship can provide an effective collaborative process for such coordination.
7. Richer modeling. Reference data in the customer domain works fine with flat modeling, involving a simple (but very wide) record. However, other domains make little sense without a richer, hierarchical model, as with a chart of accounts in finance or a bill of materials in manufacturing. Metrics and KPIs – so common in BI today – rarely have proper master data in multidimensional models. (A minimal sketch of hierarchical reference data follows this list.)
8. Beyond enterprise data. Despite the obsession with customer data that most MDM solutions suffer, almost none of them today incorporate data about customers from Web sites or social media. If you’re truly serious about MDM as an enabler for CRM, next-generation MDM (and CRM, too) must reach into every customer channel. In a related area, users need to start planning their strategy for MDM with big data and advanced analytics.
9. Workflow and process management. Too often, development and collaborative efforts in MDM are mostly ad hoc actions with little or no process. For an MDM program to scale up and grow, it needs workflow functionality that automates the proposal, review, and approval process for newly created or improved reference data. Vendor tools and dedicated applications for MDM now support workflows within the scope of their tools. For a broader scope, some users integrate MDM with business process management tools. (A sketch of such an approval workflow also follows this list.)
10. MDM solutions built atop vendor tools and platforms. Admittedly, many user organizations find that home-grown and hand-coded MDM solutions provide adequate business value and technical robustness. However, these are usually simple, siloed departmental solutions. User organizations should look into vendor tools and platforms for MDM and other data management disciplines when they need broader data sharing and more advanced functionality, such as real-time operation, two-way sync, identity resolution, event processing, service orientation, and process workflows or other collaborative functions.
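To make priority 7 concrete, here is a minimal sketch of hierarchical reference data, using a chart of accounts; the account codes, names, and structure are illustrative assumptions, not any standard model.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    """One node of hierarchical master data (hypothetical fields)."""
    code: str
    name: str
    children: list["Account"] = field(default_factory=list)

    def walk(self, depth=0):
        # Depth-first traversal, so consumers can flatten or render
        # the hierarchy as needed.
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)

# A tiny chart of accounts; a flat, single-record model could not
# express these parent-child rollups.
chart = Account("1000", "Assets", [
    Account("1100", "Current Assets", [
        Account("1110", "Cash"),
        Account("1120", "Accounts Receivable"),
    ]),
    Account("1200", "Fixed Assets"),
])

for depth, acct in chart.walk():
    print("  " * depth + f"{acct.code} {acct.name}")
```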
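And for priority 9, a toy sketch of the proposal, review, and approval workflow for newly created master data; the states, transitions, and roles are assumptions made for the example, not any particular vendor tool’s model.

```python
from enum import Enum

class State(Enum):
    PROPOSED = "proposed"
    IN_REVIEW = "in review"
    APPROVED = "approved"
    REJECTED = "rejected"

# Legal transitions: the workflow's job is to refuse anything else.
TRANSITIONS = {
    State.PROPOSED: {State.IN_REVIEW},
    State.IN_REVIEW: {State.APPROVED, State.REJECTED},
}

class MasterRecordDraft:
    def __init__(self, payload):
        self.payload = payload
        self.state = State.PROPOSED
        self.history = [(State.PROPOSED, "submitter")]  # audit trail

    def move_to(self, new_state, actor):
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self.history.append((new_state, actor))

draft = MasterRecordDraft({"entity": "customer", "name": "ACME Corp"})
draft.move_to(State.IN_REVIEW, actor="data steward")
draft.move_to(State.APPROVED, actor="governance board")
print(draft.state, draft.history)
```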
================================
ANNOUNCEMENTS
Although the above version of the top ten list is excerpted from the upcoming TDWI report on MDM, an earlier version of this list was developed in the TDWI blog “Rules for the Next Generation of MDM.”
Be sure to visit www.tdwi.org on April 2 or later, to download your own free copy of the complete TDWI report on Next Generation Master Data Management.
Please attend the TDWI Webinar where I present the findings of my TDWI report Next Generation MDM, on April 10, 2012, at noon ET.
Register online for the Webinar.
Posted by Philip Russom, Ph.D. on March 16, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
I’ve just completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 40-page report in a PDF file on April 2, 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the third in a series of three excerpts. If you haven’t already, you should read the first excerpt and the second excerpt before continuing.
Technical Solutions for MDM
An implementation of MDM can be complex, because reference data needs a lot of attention, as most data sets do. MDM solutions resemble data integration (DI) solutions (and are regularly mistaken for them), in that MDM extracts reference data from source systems, transforms it to normalized models that comply with internal MDM standards, and aggregates it into a master database where both technical and business people can profile it to reveal duplicates and non-compliant records. Depending on the architecture of an MDM solution, this database may also serve as an enterprise repository or system of record for so-called golden records and other persistent reference records. If the MDM solution supports a closed loop, records that are improved in the repository are synchronized back to the source systems from which they came. Reference data may also be output to downstream systems, like data warehouses or marketing campaign systems.
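As a highly condensed sketch of the hub flow just described (extract, transform, aggregate, profile), consider the toy example below; the source systems, fields, and normalization rule are all hypothetical, and a real hub would add identity resolution, survivorship, and the closed-loop sync back to sources.

```python
import sqlite3

# Extract: customer reference records pulled from two (hypothetical)
# source systems.
crm_rows = [("Acme Corp.", "ny"), ("acme corp", "NY")]
erp_rows = [("ACME CORP", "NY"), ("Bolt Ltd", "CA")]

def normalize(name, region):
    # Transform: conform each record to the hub's internal standard.
    return (name.strip().rstrip(".").upper(), region.upper())

# Aggregate: land everything in one master table.
hub = sqlite3.connect(":memory:")
hub.execute("CREATE TABLE master_customer (name TEXT, region TEXT, source TEXT)")
for source, rows in (("CRM", crm_rows), ("ERP", erp_rows)):
    hub.executemany("INSERT INTO master_customer VALUES (?, ?, ?)",
                    [normalize(*r) + (source,) for r in rows])

# Profile: surface candidate duplicates for stewards to review.
for row in hub.execute("""
        SELECT name, region, COUNT(*), GROUP_CONCAT(source)
        FROM master_customer GROUP BY name, region HAVING COUNT(*) > 1"""):
    print(row)
```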
MDM solutions also resemble data quality (DQ) solutions, in that many data quality functions are applied to reference data. For example, “customer” is the business entity most often represented in reference data. Customer data is notorious for data quality problems that demand remediation, and customer reference data is almost as problematic. We’ve already mentioned deduplication and standardization. Other data quality functions are also applied to customer reference data (and sometimes other entities, too), including verification and data append. Luckily, most tools for MDM (and related disciplines such as data integration and data quality) can automate the detection and correction of anomalies in reference data. Development of this automation often entails the creation and maintenance of numerous “business rules,” which can be applied automatically by the software, once deployed.
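Here is a toy illustration of such business rules: each rule detects one class of anomaly and, where a safe correction exists, applies it automatically, otherwise flagging the record for a data steward. The fields and rules are invented for the example.

```python
import re

# Each rule: (description, field, validity check, optional auto-fix).
RULES = [
    ("postal code is 5 digits", "postal_code",
     lambda v: bool(re.fullmatch(r"\d{5}", v)), None),
    ("state is upper case", "state",
     lambda v: v == v.upper(), str.upper),  # safe to auto-correct
]

def apply_rules(record):
    issues = []
    for name, fld, check, fix in RULES:
        if not check(record[fld]):
            if fix:  # correct in place when a safe fix exists
                record[fld] = fix(record[fld])
            else:    # otherwise flag for a data steward
                issues.append(name)
    return record, issues

rec, issues = apply_rules({"postal_code": "1234", "state": "ny"})
print(rec, issues)  # state corrected; postal code flagged for review
```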
================================
ANNOUNCEMENTS
Keep an eye out for another MDM blog, coming March 16. I’ll tweet, so you know when that blog is posted.
Please attend the TDWI Webinar where I will present the findings of my TDWI report Next Generation MDM, on April 10, 2012, at noon ET.
Register online for the Webinar.
Posted by Philip Russom, Ph.D. on March 2, 2012