Three Altruistic Goals for Data Integration
These three goals can help DI specialists (and the people who work closely with them) stay focused on the noblest purposes and greatest value propositions of DI
- By Philip Russom, Ph.D.
- May 6, 2010
Certain characterizations of data integration (DI) seem unfortunate to me. For example, people regularly talk about DI as a form of “plumbing.” The reason this analogy rubs me the wrong way is that water deteriorates as it turns from tap water into sewage, whereas data improves as it moves through DI infrastructure. Likewise, DI is regularly characterized as mere data movement or as a technology practice far removed from business impact.
To help us all get beyond characterizations of DI that are inaccurate (and sometimes demeaning), allow me to point out three of DI’s altruistic goals. It’s my hope that these goals can help DI specialists and the people who work closely with them stay focused on the noblest purposes and greatest value propositions of DI.
Goal #1: Data integration is the repurposing of data via transformation
At a basic level, any DI job will access data through an interface and extract a copy of some of that data, but that’s not the defining characteristic of DI. In fact, moving data is just a means to an end. Yet DI is often characterized as being mostly about moving data.
True integration is about transforming data -- as the T in ETL so powerfully proclaims. The transformation can be simple, as when replication reorders the fields of a record to fit a table in a target database or when a table join maps source schemas to a common one so the tables can be combined. A transformation may also be complex, as seen in the multidimensional structures of a data warehouse.
As you can see from these cases, DI is defined by its transformation of data, whereas accessing and copying data is coincidental.
Furthermore, data transformation is a technical task that supports a business goal -- namely, repurposing data for a business use that differs from its original, intended use.
DI’s true calling is to transform data to make it as suited as possible to a given business purpose. To keep this value proposition in focus, we should all avoid characterizations that stress the coincidental movement of data.
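The simple transformations mentioned above (reordering fields to fit a target table and mapping source schemas to a common one so records can be combined) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the two source systems, and the helper functions `to_common` and `combine` are hypothetical and do not come from any particular DI tool.

```python
from typing import Any, Dict, List

# Hypothetical column order of the target table (an assumption for illustration).
TARGET_COLUMNS = ["customer_id", "name", "country"]

# Hypothetical mappings from two source schemas to the common schema.
CRM_TO_COMMON = {"cust_no": "customer_id", "full_name": "name", "country_code": "country"}
BILLING_TO_COMMON = {"id": "customer_id", "customer_name": "name", "ctry": "country"}


def to_common(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename source fields to common names and emit them in target order.

    Source fields without a mapping are dropped; missing fields become None.
    """
    renamed = {mapping[key]: value for key, value in record.items() if key in mapping}
    return {column: renamed.get(column) for column in TARGET_COLUMNS}


def combine(crm_rows: List[Dict[str, Any]], billing_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Bring both sources into the common schema so they can be combined."""
    rows = [to_common(row, CRM_TO_COMMON) for row in crm_rows]
    rows += [to_common(row, BILLING_TO_COMMON) for row in billing_rows]
    return rows


crm = [{"full_name": "Ada Lovelace", "country_code": "GB", "cust_no": 1}]
billing = [{"ctry": "US", "id": 2, "customer_name": "Grace Hopper", "balance": 10.0}]
combined = combine(crm, billing)
```

Note that nothing in the sketch is about moving data; the work lies in reshaping each record so that it suits the target.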
Goal #2: Data integration is a value-adding process
Think about how a manufacturing process consumes material in various states of rawness or completeness, processes the material to make it suited to a new purpose, then combines it into a product that’s far greater than the sum of its parts. The data transformation and repurposing mentioned in the previous goal have an effect similar to manufacturing, in that something truly new (and usually more valuable) results.
For example, the calculated values, aggregates, and dimensional models of a data warehouse manage high-value data that doesn’t exist elsewhere -- not even in the source IT systems that provided raw material for DI. The complete view of a customer produced by data synchronization and the quick snapshot across multiple applications produced by data federation are likewise unique datasets that provide a higher value (or, at least, a different value) than data in its original, disparate state.
Furthermore, DI and data quality are increasingly invoked in the same data processing job to further raise the quality of the data being integrated. Hence, data integration specialists should always raise the bar by looking for ways to add further value to data as they integrate and repurpose it.
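As a rough illustration of this value-adding idea, the sketch below applies a basic data quality step and then derives aggregates (order count, total, and average per region) that do not exist in the raw source records. The record layout, the cleansing rules, and the function names `clean` and `summarize` are assumptions made for the example, not a prescribed method.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def clean(order: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Data quality step: normalize the region and reject unusable rows."""
    region = (order.get("region") or "").strip().upper()
    amount = order.get("amount")
    if not region or amount is None or amount < 0:
        return None  # row cannot be repaired, so it is excluded
    return {"region": region, "amount": float(amount)}


def summarize(orders: List[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Integration step: build per-region aggregates from cleaned rows."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: {"order_count": 0, "total": 0.0})
    for order in orders:
        row = clean(order)
        if row is None:
            continue
        bucket = totals[row["region"]]
        bucket["order_count"] += 1
        bucket["total"] += row["amount"]
    # The average is a calculated value that exists only in the output.
    return {
        region: {**bucket, "average": bucket["total"] / bucket["order_count"]}
        for region, bucket in totals.items()
    }


# Hypothetical raw records from a source system.
raw = [
    {"region": " east ", "amount": 100},
    {"region": "East", "amount": 50},
    {"region": "west", "amount": 30},
    {"region": "", "amount": 20},
    {"region": "west", "amount": None},
]
summary = summarize(raw)
```

Running cleansing and aggregation in one pass mirrors the point above: the same job both raises the quality of the data and produces something new from it.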
Goal #3: A data integration solution should be the product of collaboration
Collaboration with a variety of technical and business people is an increasing part of DI specialists’ workloads because, as the amount and breadth of DI work increases, so does the number of DI specialists in a user organization.
DI work is increasingly coordinated closely with similar efforts in data quality and master data management to produce the highest quality data possible. An increasing number of business people are helping decide which data should be integrated and how, following the best practices of data stewardship established by data quality initiatives.
The governance and compliant use of data are not just for operational applications and their users; they also extend to data management practices such as DI. Hence, some DI specialists must collaborate with governance and compliance boards to determine how their policies apply to DI practices. Put all these together, and you see that a modern approach to DI involves considerable coordination and collaboration with a variety of technical and business teams.
Philip Russom is senior manager of TDWI Research. Philip can be reached at firstname.lastname@example.org.