Hadoop Integration Benchmark: Product Profile and Evaluation - Talend and Informatica
February 19, 2016
Over the past decade, Hadoop has become increasingly prevalent in the information management landscape, giving digital strategies the ability to operate at large scale. Hadoop helps organizations that are data-driven (or aspire to be) overcome the obstacles of storage and processing scale and cost that come with ever-increasing volumes of data. It does not, in and of itself, solve the challenge of high-performance data integration, an activity that consumes an estimated 80% or more of the time spent getting actionable insights from big data.
Modern data integration tools were built in a world of structured data, relational databases, and data warehouses. The paradigm shift to big data and Hadoop has disrupted some of the ways we derive business value from data. Unfortunately, the data integration tool landscape has lagged behind this shift. Early adopters of big data in their enterprise architecture have only recently found some variety and choice in data integration tools and capabilities to accompany their increased data storage capabilities.
With many vendors throwing their hats into the big data arena, it will become increasingly challenging to identify and select the right tool. The key differentiators to watch will be the depth to which a tool leverages Hadoop and the performance of its integration jobs. This white paper evaluates two leading integration vendors, Talend and Informatica, and benchmarks their performance against these criteria.