There was a time when a data warehouse architecture consisted of a chain of databases, all running on one or two machines in our own data centers. Handwritten ETL programs copied and transformed data from one database to another. Today, a great deal of new technology offering innovative opportunities has become available, there are many new BI requirements, and we have new ways to design our data warehouse architectures. Data warehouse architects are struggling with all these new developments. They have to find answers to an almost endless list of questions. Should the data warehouse be developed with Hadoop? Do we still need data marts if the BI tools read data into memory? Can we use Spark as a query performance booster? What does it mean to design data-vault-based data warehouses? How do data streaming and the IoT work together with the data warehouse? Should we move the entire architecture into the cloud? Can we replace the data warehouse with a data lake? What is the role of the logical data warehouse? Will an analytical SQL database server solve all our query performance problems? And so on.
This tutorial discusses these architectural and technical developments. How are they interrelated? How can you migrate to a modern architecture? What are the pros and cons of each development?
You Will Learn
- What the use cases of Hadoop and Spark are in a data warehouse architecture.
- The five levels of BI in the cloud and how they differ.
- What the advantages are of using data vault as a design technique.
- Whether data warehouse automation is hype or reality.
- How Spark can be used to boost query performance and may even replace data marts.
- How a logical data warehouse and virtual data lake can work together.
- How analysis of streaming data can be embedded in a more classic architecture.
- Why operational BI demands a new architecture.
Who Should Attend
- Business intelligence specialists and data warehouse designers who want to know about all the new developments.
- Data scientists, data analysts, and business analysts who use and work with data every day and who want to know which of these developments may help them.
- Technology planners, technical architects, and enterprise architects who need to know how to evaluate all these new developments on their technical merits.
- Database developers and database administrators who need to know what impact Hadoop and Spark have on databases.
- IT managers who need to be informed about all these new developments to see what the potential business benefits are.