May 5, 2011
32 Facts about Next-Generation Data Integration
Topic: Data Management
To raise awareness of the new tool features, user techniques, and team structures of Next Generation Data Integration (NGDI), I recently issued a series of 32 tweets via Twitter over a three-week period. Most of these tweets triggered responses to me or retweets, so I seem to have reached the DI audience I was looking for -- or touched a nerve.
To help you better understand NGDI and why you should care about it, I'd like to share some of the thoughts from these tweets with you. I think you'll find them interesting because they are a compact study of NGDI, yet they're amazingly comprehensive.
Every tweet I wrote was a short sound bite drawn from TDWI's recent report on NGDI, which I researched and wrote. Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.
I've removed the arcane acronyms, abbreviations, and incomplete sentences typical of tweets to make this more comprehensible to any non-Tweeters. I issued the original tweets in groups on related topics. I've added headings here to show that organization.
Properties of Next Generation Data Integration
1. NGDI is the advanced and broadened state that DI has evolved into.
2. It's about choosing the right DI options from dozens now available.
3. NGDI goes beyond ETL as well as ELT, federation/virtualization, replication, change data capture (CDC), data quality (DQ), master data management (MDM), and data services.
4. It goes beyond DW as well as data migration, consolidation, synchronization, and B2B DI.
5. It is an autonomous discipline, no longer a dark corner of DW or DBA work.
6. NGDI includes collaboration with large DI teams and with other data teams and business management.
7. NGDI uses many interfaces, old (ODBC, APIs) and new (Web and data services, message or service bus).
8. NGDI includes real architecture, not a bucket of interfaces, transforms, and flows.
9. Why should you care about NGDI? Data integration is growing fast; it is becoming the IT infrastructure.
The State of Data Integration Disciplines
10. Ninety-five percent of surveyed organizations use ETL. For 75 percent, ETL is the highest data-integration priority.
11. ELT is the second data-integration priority.
12. Replication and data synchronization together are the third priority of data integration; 45 percent of respondents use these.
13. Forty percent of respondents perform NGDI over a message or service bus. That's surprisingly advanced!
14. Data federation is finally ensconced as a data integration technique; 30 percent do it.
15. Twenty percent of surveyed organizations do event processing in their data integration solutions.
Data Integration Tool Portfolios
16. Nineteen percent of respondents want to simplify NGDI tool portfolios and want fewer but broader tools from fewer vendors.
17. When it comes to integrated suites of NGDI tools, 9 percent are using them today; 42 percent expect to use them in three years.
18. Only 18 percent of respondents confess to using hand-coding as a primary data integration technique.
19. About 40 percent of data integration tool functions are used today, rising to about 65 percent in three years.
DI Tool and Platform Replacements
20. One-third of respondents plan DI tool replacement in 2011-2012. Two-thirds won't replace them.
21. Why replace an NGDI tool? Respondents wanted an integrated suite, a more business-friendly tool, scalability, real-time support, better services, or high availability.
Data Types Handled via Data Integration Solutions
22. Data integration today handles structured data (according to 99 percent of respondents), hierarchic or legacy (84 percent), and semi-structured data (62 percent).
23. Semi-structured data (mostly XML) will jump from 62 percent today to 87 percent in three years. Expect XML in your NGDI solution!
24. Data types set for greatest NGDI growth: event, spatial, and unstructured. Each will grow from little use now to over 90 percent in three years.
25. Greatest NGDI growth: event, spatial, and unstructured data. Each is little used now but will grow to over 90 percent in three years.
Trends for Teams of Data Integration Specialists
26. The average number of data integration specialists per organization ranges from 13.1 to 16.4.
27. Data integration specialists work on a BI/DW team (59 percent), DBA/architecture team (39 percent), in central IT (37 percent), or on an independent DI team (30 percent).
28. NGDI work is coordinated with BI/DW, application integration, data architecture, modeling, governance, data quality, and master data management, in that order.
Growth and Decline among NGDI Disciplines
29. The fastest growing NGDI discipline is (drum roll, please!) master data management (MDM).
30. After MDM, NGDI growth is in real-time data quality, real-time data integration, data governance, complex event processing, tools for business users, metadata, text analytics, in-memory DI, data quality, and federation/virtualization.
31. Disciplines that will grow from little use today to big use in DI in three years: software-as-a-service, cloud computing, open source, and Hadoop.
32. Expect the greatest declines as we enter the age of NGDI to occur in ETL, batch processing, and hand coding.
For Further Reading
For a more detailed discussion of NGDI -- in a traditional publication -- see the TDWI Best Practices Report, Next Generation Data Integration, available as a free PDF download.
You can also register for and replay the TDWI Webinar on the same subject.
If you want to find the tweets on Twitter.com, search for the hashtags #NGDI, #TDWI, and #DataIntegration.
Philip Russom is the senior manager of TDWI Research at The Data Warehousing Institute (TDWI). Philip can be reached at firstname.lastname@example.org or prussom on Twitter.
Copyright 2011. TDWI. All rights reserved.