RESEARCH & RESOURCES

EXCERPT - Introduction to Next Generation Data Warehouse Platforms

By Philip Russom

If you’re a data warehouse professional—or you work closely with one—you’ve probably noticed the many new options for data warehouse platforms that have appeared this decade.

We’ve seen the emergence of new categories of data warehouse (DW) platforms, such as data warehouse appliances and software appliances. A new interest in columnar databases has led to several new vendor products and renewed interest in older ones. Open source Linux is now common in data warehousing, and open source databases, data integration tools, and reporting platforms have come out of nowhere to establish a firm foothold. In the hardware realm, 64-bit computing has enabled larger in-memory data caches, and more vendors now offer MPP architectures. Leading database vendors have added more features and products conducive to data warehousing.

Those are mostly features within the data warehouse platform, especially its database. There are also growing practices that are demanding support from the platform, including real-time integration between the data warehouse platform and operational applications, various types of advanced analytics, and reusable interfaces exposed through Web services or service-oriented architecture (SOA). Furthermore, a number of data warehouse platforms and other business intelligence platforms are now readily available through software-as-a-service (SaaS) and cloud computing.

The good news is that the options for data warehouse platforms have recently become far more numerous. The bad news is that it’s difficult for data warehouse professionals and their business sponsors to keep track of these advancements and select the ones that are appropriate for their needs.

To help organizations understand the many new options available to them, this report catalogs the new data warehouse platform products, features, and techniques that have appeared this decade, plus notable advances in more established data warehouse platforms. As examples, the report mentions many vendors and their products. From the survey data cited here, you’ll see that many organizations are planning the next generation of their data warehouse, and this report provides information that can be instrumental for such planning. The focus is on technology, but this report also explains how technology’s adoption in next generation data warehouse platforms is driven by real-world business and organizational needs and requirements.

Definitions of Terms and Concepts
DATA WAREHOUSE PLATFORM

For the purposes of this report, a data warehouse platform consists of one or more hardware servers, an operating system, a database management system (DBMS), and data storage. These communicate via a LAN or WAN, although a multi-node data warehouse platform may have its own specialized network. Note that a data warehouse platform manages a data warehouse, defined as a collection of metadata, data model, and data content, designed for the purposes of reporting, analyzing information, and making decisions. But the data warehouse is not part of the platform per se. (See Figure 1.) All these components and more have seen generational advances in recent years.

GENERATIONS OF DATA WAREHOUSES

TDWI’s position is that certain relatively new technologies, techniques, and business practices are driving the majority of data warehouses and their platforms toward a redesign, major retrofit, or even replacement that we can recognize as a generation. TDWI takes the term literally, meaning that the current generation of a data warehouse will beget the next generation. In many cases, generational change is an evolutionary process that adapts the resulting data warehouse to changing business and technology requirements. In fact, generational change is often driven by these requirements, as is explained in detail in the next section of this report. In other cases, generational change is more of a maturation process that steps a data warehouse through multiple stages of a lifecycle.

NEXT GENERATION DATA WAREHOUSE PLATFORMS

What’s next for a given organization’s data warehouse platform can vary tremendously. For example, a next generation data warehouse platform may tap into leading-edge features, such as appliances, open source, and cloud computing. It may simply get you caught up with somewhat more established practices for real-time operation, advanced analytics, and services. Sometimes, the next generation addresses administrative issues, such as hardware upgrades (from 32-bit to 64-bit), data migrations (from one DBMS to another), or architectural changes (from SMP to MPP). So, let’s keep in mind that a next generation data warehouse platform is a relative concept, because it depends on where you’re starting, what new requirements you must address, and how many resources you have.


WHY CARE ABOUT DATA WAREHOUSE PLATFORMS NOW?
  • Businesses face change more often than ever before. Recent history has seen businesses repeatedly adjusting to boom-and-bust economies, a recession, financial crises, and shifts in global dynamics or competitive pressures. Increasingly, businesses rely on the data warehouse and related business intelligence infrastructure to understand change and react appropriately.
  • DW platforms need updating to support changing business requirements. In fact, many of the technologies associated with the next generation DW relate to change in some way, such as advanced analytics, scalable architectures, virtualization methods, reusable services, real-time integration with operational applications, and so on.
  • Successful DWs mature through multiple lifecycle stages. This usually provokes changes in the underlying DW platform and elsewhere in the business intelligence (BI) infrastructure.
  • There’s probably a new generation in your near future. TDWI survey data shows that almost half of respondents are planning a data warehouse platform replacement in 2009–2012. Many others anticipate keeping their current platforms, but updating them significantly.
USER STORY: MANAGEMENT REQUIREMENTS OFTEN DICTATE THE DESIGN OF A NEXT GENERATION DW AND ITS PLATFORM.

“We pulled together our current data warehouse a couple of years ago,” said Karl Mikula, the data and BI manager at Hagerty Insurance Agency, America’s leading provider of products and services for collectors of classic cars and boats. “Now that the company sees the value, we’re building our next generation data warehouse and BI solution atop a platform that’ll do what the company needs. In a nutshell, upper management wants to adapt a performance management methodology with scorecards. And they want self-service BI, where they can search a repository and pull data into reports or spreadsheets of their own design, presented through a corporate portal. To support this, we’re designing a data warehouse that stores metrics and KPIs in a searchable repository. For the next generation platform, we have a database management system, a data integration tool, a reporting tool, a search engine, and an enterprise portal. All these come from Microsoft, and they’re all tightly integrated out of the box.”


Philip Russom is the senior manager of TDWI Research at The Data Warehousing Institute, where he oversees many of TDWI’s research-oriented publications, services, and events. He can be reached at prussom@tdwi.org.

This article was excerpted from the full, 32-page report, Next Generation Data Warehouse Platforms. You can download this and other TDWI Research free of charge at tdwi.org/research/reportseries.

The report was sponsored by Aster Data Systems, HP, IBM, Infobright, Kognitio, Microsoft, Oracle/Intel, Sybase, and Teradata.

