Non-data-warehouse ETL Usage Growing
Increasingly, ETL is being tapped to support non-data warehousing activities, such as database consolidations and migrations
- By Stephen Swoyer
- January 12, 2005
Historically, ETL and data warehousing have been two peas in a pod. But that’s changing, says Phil Russom of Forrester Research: Increasingly, ETL is being tapped to support non-data warehousing activities, such as database consolidations and migrations. In fact, you can expect non-data warehousing use of ETL to grow even more in 2005.
The lion’s share of ETL use still occurs in conjunction with data warehousing, Russom says. But right now, one-fifth of ETL use-case scenarios involve non-data-warehousing activities.
“Consistently, data warehouse … usage has constituted the lion’s share, with non-data-warehouse … usage a small minority,” Russom writes. “The mix has shifted slowly over the years, accelerating this decade with the minority percentage [of non-data-warehouse usage] increasing to almost 20 percent.”
Russom says non-data-warehouse ETL usage will continue to grow. He bases his conclusions on a recent Forrester survey of 28 ETL users. Asked how their ETL usage splits between data warehouse and non-data-warehouse work, respondents reported a split that, taken across the whole sample, came to 81 percent data warehouse usage and 19 percent non-data-warehouse usage.
But there’s more to this than meets the eye, says Russom. “If we instead count organizations, 15 out of 28 had some kind of non-DW usage—roughly half,” he writes. “Two organizations reported 100 percent non-DW usage, showing that ETL needn’t be associated with data warehousing at all. And two other organizations reported non-DW usage as 50 and 65 percent respectively.”
So for what purposes other than data warehousing are organizations commonly tapping ETL? For starters, to move data between and among applications.
“At 9.48 percent of all ETL use, this is the largest category of non-DW ETL use. ETL may support data propagation, where data moves one way from a system of record to others,” Russom indicates. “ETL could support data synchronization, where data moves two ways among systems where it is redundant. This is common in financial services companies where customer data may be entered or changed in multiple systems.”
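The distinction Russom draws between propagation and synchronization can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `System` class, the `propagate` and `synchronize` functions, and the use of a per-record version number to decide which copy wins are hypothetical, and are not drawn from the Forrester report or from any particular ETL product.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """Hypothetical stand-in for an application's data store."""
    name: str
    # Maps a record key to a (version, value) pair. The version number
    # is an assumed way to tell which copy of a record is newer.
    records: dict = field(default_factory=dict)


def propagate(source: System, targets: list) -> None:
    """One-way movement: copy the system of record's data to each target."""
    for target in targets:
        target.records.update(source.records)


def synchronize(a: System, b: System) -> None:
    """Two-way movement: both systems end up with the newer copy of each record."""
    for key in a.records.keys() | b.records.keys():
        candidates = [s.records[key] for s in (a, b) if key in s.records]
        newest = max(candidates, key=lambda rec: rec[0])  # highest version wins
        a.records[key] = newest
        b.records[key] = newest


# Propagation: the replica receives the master's data; the master is unchanged.
master = System("master", {"c1": (3, "123 Main St")})
replica = System("replica")
propagate(master, [replica])

# Synchronization: a customer record changed in two systems is reconciled.
crm = System("crm", {"c1": (1, "old address")})
billing = System("billing", {"c1": (2, "new address"), "c2": (1, "new customer")})
synchronize(crm, billing)
```

After `synchronize` runs, both `crm` and `billing` hold the version-2 address for `c1` and the new `c2` record. Real synchronization jobs need a conflict rule of some kind; "highest version wins" is only one simple choice.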
Elsewhere, ETL is tapped to support customer data integration (1.07 percent) and database migration (3.04 percent) scenarios, which together account for roughly 4 percent of use cases. The latter scenario doesn’t typically happen on a recurring basis, although there are exceptions, says Russom. “Although the usage described here is usually a one-time project, some ETL applications may consolidate data from multiple databases on a recurring basis, as is typical of various financial consolidations,” he writes.
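The category figures quoted so far can be tallied with a few lines of arithmetic. The sketch below assumes that the three named percentages and the 19 percent non-data-warehouse total are all measured against the same base (all ETL use), which the article suggests but does not state outright. The variable names are illustrative.

```python
# Figures as quoted in the article, in percent of all ETL use (assumed common base).
non_dw_total = 19.0
named_categories = {
    "moving data between applications": 9.48,
    "customer data integration": 1.07,
    "database migration": 3.04,
}

# Sum of the three named non-data-warehouse categories.
named_sum = round(sum(named_categories.values()), 2)

# Whatever is left of the 19 percent after the named categories.
remainder = round(non_dw_total - named_sum, 2)

print(f"Named categories: {named_sum} percent")
print(f"Unattributed remainder: {remainder} percent")
```

Under that assumption, the named categories cover 13.59 percentage points, leaving about 5.4 points for other non-data-warehouse uses. Because the 19 percent total is itself rounded, the remainder is approximate.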
Miscellaneous cases of non-data-warehouse ETL usage abound. “[S]ome companies use ETL just to transform data and documents from an industry standard [such as HIPAA or EDI] to an internal format and vice-versa,” Russom writes. “Others apply ETL to creating reference data, managing master data, generating test data from production data, caching data to improve application performance, persisting data for customer self-service, integrating data as part of a fulfillment process, collaborating through data with partners, importing third-party data into corporation systems, and so on.”