This session is broken up into three topics:
- Why would you want to move the existing data warehouse?
- Technology challenges and functionality gap
- People retooling and development process changes
For each of these topics, a summary of lessons learned will be shared. This presentation is designed to explain the conceptual difference between data warehousing in RDBMS and Hadoop in a way that is easy to understand. The content is based on three years of hands-on experience slowly retiring a legacy Oracle Data Warehouse and building a new one in Cloudera Hadoop at a multibillion-dollar global corporation.
The first topic covers the business objectives, technology drivers, and management’s role in the decision to move from RDBMS to Hadoop, providing insight into the right and “not so right” reasons for this shift. It also highlights the pros and cons of revolutionary versus evolutionary change.
The second topic highlights the fundamental technological differences between an RDBMS platform and a Hadoop platform, focusing on areas where the two contrast sharply or where there is an outright functionality gap. It also covers challenges with the ETL, scheduling, metadata, performance monitoring, and tuning tools that typically work with the DW platform.
The third topic deals with the real impact on people: how organizational politics and skills gaps affect the development process in Hadoop, and how delivery and business value can be put at risk as a result.