May 15, 2015
No matter the vintage or sophistication of your organization’s data
warehouse (DW) and the environment around it, it probably needs
to be modernized. DW modernization takes many forms. Common
scenarios range from software and hardware server upgrades to
the periodic addition of new data subjects, sources, tables, and
dimensions. As data types and data velocities continue to diversify,
many users are likewise diversifying their software portfolios to
include tools and data platforms built for new and big data. A few
organizations are even decommissioning current DW platforms to
replace them with modern ones optimized for today’s requirements
in big data, analytics, real time, and cost control. Whatever
modernization strategy is in play, it requires significant adjustments
to the logical and systems architectures of the extended data
warehouse environment.
Most of the trends driving the need for data warehouse modernization boil down to four broad issues:
- Organizations demand business value from big data. In
other words, users are not content to merely manage big
data and other valuable data from new sources, such as Web
applications, machines, devices, social media, and the Internet
of things. Because big data and new data tend to be exotic in
structure and massive in volume, users need new platforms
that scale with all data types if they are to achieve business
value.
- The age of analytics is here. Many firms are aggressively
adopting a wide variety of analytic methods so they can
compete on analytics and understand evolving customers,
markets, and business processes. There is a movement from
“analyst intuition” and statistics to empirical, data-science-driven
insights. Furthermore, today’s consensus says that the
primary path to big data’s business value is through so-called
“advanced” forms of analytics, based on technologies for
mining, predictions, statistics, and natural language processing
(NLP). Each analytic technology has unique data requirements
and DWs must modernize to satisfy all of them.
- New challenges for real-time data. Technologies and
practices for real-time data have existed and been successfully
used for years. Yet, many organizations are behind in this area,
so it’s a priority for their data warehouse modernization efforts.
Even organizations that have succeeded with real-time data
warehousing and similar techniques will now need to refresh
their solutions so that real-time operations scale to exponentially
growing data volumes, streams, and greater numbers of concurrent
users and applications. Furthermore, real-time technologies
must adapt to a wider range of data types, including schema-free
and evolving ones.
- Open source software (OSS) is now ensconced in data
warehousing. Ten years ago, Linux was the only OSS product
commonly found in the technology stack for DWs, BI, analytics,
and data management. Today, TDWI regularly encounters OSS
products for reporting, analytics, data integration, and big data
management. This is because OSS has reached a new level of
functional maturity while still being economically desirable.
A growing number of user organizations are eager to leverage
both characteristics.
To help user organizations prepare, this TDWI Checklist Report
canvasses eight of the leading DW modernization scenarios,
discussing many of the new product types, functionality, and
user best practices (as well as the business case and technology
strengths) of each.