TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing


LESSON - Third-Generation ETL: Delivering the Best Performance

October 13, 2005

By Yves de Montcheuil, Director of Product Marketing, Sunopsis

As computer systems started to evolve from monolithic mainframes to distributed computing systems, and as business intelligence made its debut, the first ETL (extract, transform, load) solutions were introduced. Since that time, several generations of ETL have been produced.

First Generation: The Origin of ETL and the Legacy Code Generators

The original data integration tools generated native code for the operating system of the platform on which the data integration processes were to run. Most of these products generated COBOL, since at that time data was largely stored on mainframes. They simplified data integration by using a centralized tool to generate the processes and propagate the code to the appropriate platforms, replacing manually written programs. Performance was very good because native compiled code is inherently fast, but these tools required in-depth knowledge of programming on each platform. Maintenance was also difficult because the code was spread across different platforms and differed with the type of sources and targets.

Second Generation: The Proprietary ETL Engines

Next came the second generation of ETL tools, which are based on a proprietary engine that runs all the transformation processes. This approach removed the need to use different languages on different platforms and required expertise in only one programming language: that of the ETL tool itself. However, a new problem arose: the proprietary engine became a bottleneck in the transformation process. All data traveling from the various sources to the target had to pass through an engine that processed transformations row by row, a very slow approach when dealing with significant volumes of data.

Third Generation: The E-L-T (Extract, Load, Transform) Architecture

A new generation of ETL tools has recently appeared that addresses the challenges of the previous two generations while leveraging their respective strengths. Since the inception of the second generation (proprietary engines), database vendors have invested considerable resources in greatly improving the capabilities of their SQL languages. These improvements make it possible for an ETL tool to generate and execute highly optimized data integration processes, driven by the native SQL (or other native languages) of the databases involved in these processes.
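
To make the idea of generating native SQL concrete, here is a minimal sketch in Python. It is not how any particular E-L-T product works: the `MAPPING` dictionary, the `generate_sql` helper, and the table names are all hypothetical, and SQLite (via the standard `sqlite3` module) merely stands in for the data warehouse RDBMS. The point is only that a declarative mapping can be rendered into a single `INSERT ... SELECT` statement that the database itself executes.

```python
import sqlite3

# Hypothetical declarative mapping: target column -> SQL expression over the sources.
MAPPING = {
    "customer_id": "c.id",
    "full_name": "UPPER(c.first_name || ' ' || c.last_name)",
    "total_spent": "SUM(o.amount)",
}


def generate_sql(target, mapping, from_clause, group_by):
    """Render one INSERT ... SELECT statement from the mapping (illustrative only)."""
    cols = ", ".join(mapping)
    exprs = ",\n       ".join(f"{expr} AS {col}" for col, expr in mapping.items())
    return (
        f"INSERT INTO {target} ({cols})\n"
        f"SELECT {exprs}\n"
        f"FROM {from_clause}\n"
        f"GROUP BY {group_by}"
    )


def run_demo():
    # SQLite in memory is an assumption standing in for the warehouse server.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER, first_name TEXT, last_name TEXT);
        CREATE TABLE orders (customer_id INTEGER, amount REAL);
        CREATE TABLE dw_customer (customer_id INTEGER, full_name TEXT, total_spent REAL);
        INSERT INTO customers VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
        INSERT INTO orders VALUES (1, 10.0), (1, 15.5), (2, 7.0);
    """)
    sql = generate_sql(
        "dw_customer",
        MAPPING,
        "customers c JOIN orders o ON o.customer_id = c.id",
        "c.id, c.first_name, c.last_name",
    )
    conn.execute(sql)  # the database engine performs the whole transformation
    rows = conn.execute(
        "SELECT customer_id, full_name, total_spent FROM dw_customer ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return sql, rows


if __name__ == "__main__":
    generated, loaded = run_demo()
    print(generated)
    print(loaded)
```

No application code touches individual rows here: the Python side only assembles the statement, and the join, aggregation, and string handling all run inside the database.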

This third generation, the E-L-T architecture, provides a highly graphical environment along with the ability to generate native SQL that executes data transformations on the data warehouse server. The approach has several clearly identifiable advantages:

- It eliminates the ETL hub server sitting between sources and target, which was introduced by the second generation of ETL products.
- Using an RDBMS to execute data transformations allows bulk processing of the data. Bulk processing can be up to 1,000 times faster than row-by-row processing, and the larger the volume of data to manage, the more important bulk processing becomes.
- It provides better performance than any other solution because transformations are executed by the RDBMS engine in bulk rather than row by row, as in second-generation ETL. In addition, the large database vendors (Oracle, IBM, Microsoft, Sybase, and so on) have had significantly more resources and time to invest in improving the performance of their engines than the vendors of second-generation ETL software have. Relying on the RDBMS engines provides a way to leverage these investments.
- In the E-L-T architecture, all database engines can potentially participate in a transformation, so each part of the process runs where it is best optimized. Any RDBMS can be an engine, and it may make sense to distribute the SQL code among sources and target to achieve the best performance. For example, a join between two large tables may be done on the source.
- A centralized design tool makes programming and maintenance easy, allowing developers to control which database engine processes which piece of information in which way.
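
The contrast between row-by-row and bulk processing in the list above can be sketched as follows. This is an illustration under stated assumptions, not a benchmark: the table names (`src_sales`, `tgt_sales`) and helper functions are hypothetical, and an in-memory SQLite database stands in for the RDBMS. The sketch only shows that the two styles produce the same result; any speed difference depends on the database, the network, and the data volume.

```python
import sqlite3


def make_db(n_rows=1000):
    """Create an in-memory database with a small, made-up source table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_sales (id INTEGER, qty INTEGER, unit_price REAL)")
    conn.execute("CREATE TABLE tgt_sales (id INTEGER, revenue REAL)")
    conn.executemany(
        "INSERT INTO src_sales VALUES (?, ?, ?)",
        [(i, i % 5 + 1, 2.5) for i in range(n_rows)],
    )
    return conn


def load_row_by_row(conn):
    # Engine style: pull every row into the application, transform it there,
    # and write it back with one INSERT per row.
    rows = conn.execute("SELECT id, qty, unit_price FROM src_sales").fetchall()
    for row_id, qty, price in rows:
        conn.execute("INSERT INTO tgt_sales VALUES (?, ?)", (row_id, qty * price))


def load_in_bulk(conn):
    # E-L-T style: a single set-based statement executed entirely by the database.
    conn.execute("INSERT INTO tgt_sales SELECT id, qty * unit_price FROM src_sales")


def result(conn):
    return conn.execute("SELECT id, revenue FROM tgt_sales ORDER BY id").fetchall()


def compare():
    """Run both styles on identical data and return both result sets."""
    a, b = make_db(), make_db()
    load_row_by_row(a)
    load_in_bulk(b)
    return result(a), result(b)


if __name__ == "__main__":
    by_row, in_bulk = compare()
    print(by_row == in_bulk, len(in_bulk))
```

In the row-by-row version every record crosses the boundary between the database and the application twice; in the bulk version the data never leaves the database, which is the property the E-L-T approach relies on.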

Today’s RDBMSs have the power to perform any data integration work. Third-generation E-L-T tools take advantage of this power by leveraging and orchestrating the work of these systems—and processing all data transformations in bulk.
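
As a rough sketch of what orchestrating several database engines might look like, the example below runs a join on a source database, loads the joined rows into a staging table on the target, and then aggregates on the target. Everything here is an assumption made for illustration: two in-memory SQLite connections play the roles of the source and warehouse systems, and the table names and the `run_pipeline` function are hypothetical.

```python
import sqlite3


def run_pipeline():
    source = sqlite3.connect(":memory:")  # stands in for the source RDBMS
    target = sqlite3.connect(":memory:")  # stands in for the warehouse RDBMS

    source.executescript("""
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
        CREATE TABLE customers (customer_id INTEGER, region TEXT);
        INSERT INTO orders VALUES (1, 10, 100.0), (2, 11, 40.0), (3, 10, 60.0);
        INSERT INTO customers VALUES (10, 'EMEA'), (11, 'APAC');
    """)
    target.executescript("""
        CREATE TABLE stg_orders (order_id INTEGER, region TEXT, amount REAL);
        CREATE TABLE fact_region_sales (region TEXT, total_amount REAL);
    """)

    # Extract: the join between the two source tables runs on the source engine.
    joined = source.execute("""
        SELECT o.order_id, c.region, o.amount
        FROM orders o
        JOIN customers c ON c.customer_id = o.customer_id
    """).fetchall()

    # Load: move the already-joined rows into a staging table on the target.
    target.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", joined)

    # Transform: the aggregation runs on the target engine as one bulk statement.
    target.execute("""
        INSERT INTO fact_region_sales
        SELECT region, SUM(amount) FROM stg_orders GROUP BY region
    """)
    rows = target.execute(
        "SELECT region, total_amount FROM fact_region_sales ORDER BY region"
    ).fetchall()

    source.close()
    target.close()
    return rows


if __name__ == "__main__":
    print(run_pipeline())
```

The orchestrating script decides which engine runs which statement, but it performs no transformation logic of its own; a real tool would typically use the databases' bulk loaders rather than passing rows through the client as this sketch does.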

