TDWI Experts in: Data Warehousing

6/3/2010

  • Experts Agile DW Environment

    Data warehouses were once considered a major undertaking that involved a long-term implementation effort and significant resources. Now they often need to be quickly deployed so that organizations can act upon new opportunities, identify cost savings, incorporate decision-making capabilities within operational systems, and, in general, achieve or enhance their competitive advantages.

    According to Merriam-Webster's Collegiate Dictionary, the definitions of agile include "having a quick resourceful and adaptable character." Although Merriam-Webster uses "an agile mind" as an example, I believe the definition also describes an agile data warehousing environment that enables users to quickly perform analyses in support of their decision-making processes. Furthermore, it also describes decision-making algorithms embedded in operational systems (e.g., customized offerings of additional products to Web-site purchasers or checking names against a terrorist watch list). This article focuses on ways to create a more agile data warehouse environment that facilitates end-user decision making.

    Almost all of us have experienced situations where users wanted to augment their data warehouses with another data source, but by the time a decision was made to include this data, the deadline for acting on the opportunity had long passed. In these situations, typified by one-time "do or don't do" decisions, a 75 percent solution today is worth far more than a 100 percent solution only available after the fact.

    This is not to say that any and all data should freely be sourced into an enterprise data warehouse (EDW). Rather, alternatives should be available to support decisions that require data not available from the EDW. Although the integrity of the data within an EDW should be sacrosanct, compromises can often be made in support of one-time, or limited-scope (e.g., departmental) analysis efforts. As long as users are aware of the potential data quality compromises, this can be done using special-purpose data marts, data warehouse appliances, analytic databases, or enterprise information integration (EII) implementations that directly access data from operational systems.

    Even if the source data is available within the EDW, it may be more expedient to move the data to another platform for analysis. This could occur, for example, if the EDW is running at full capacity or another platform (such as a special-purpose appliance) could greatly outperform the EDW.

    To create an agile environment, organizations must make it easier to deploy analysis tools and to quickly enhance their underlying data warehousing infrastructure. Fortunately, a variety of technologies and techniques facilitate this, including the following.

    Cloud computing and third-party hosting of on-demand computing power, associated storage, and business intelligence tools allow organizations to quickly deploy a data warehouse platform with minimal upfront costs and without long-term commitments. Such a platform could be used as an adjunct to the existing in-house environment to expand overall analysis capabilities on an "as-needed" basis. Several vendors also market customer-centric data through an on-demand channel.

    Open source software, including data integration, database, and business intelligence offerings, permits organizations to economically obtain these capabilities without necessarily increasing their license fees for proprietary software used in "mission-critical" enterprise data warehouse implementations.

    Analytic (e.g., column-oriented) databases are optimized for analyzing large quantities of data and can dramatically reduce query run times.

    Virtualization techniques can leverage existing hardware and reduce or eliminate the need to acquire additional computing power.

    Organizational bureaucracy can be reduced by establishing or modifying procedures associated with special, one-time requests. For example, although a strong change management process is important for an enterprise data warehouse, it may be overkill for one-time analysis efforts. Along the same lines, the use of software from vendors not on the "approved vendor" list (including perhaps open source vendors) might also be allowed, at least as long as the one-time request does not evolve into a regularly run production application.

    Dependence on IT can be reduced by creating parameter-driven report and analysis templates that can be used by less-skilled employees. Organizations should also consider establishing a vehicle for generalizing valuable but somewhat specific BI analyses (i.e., converting them to parameter-driven requests) so that they can be applied to other situations.

    Enhanced search and delivery capabilities allow users to find relevant templates and existing analyses and to have results delivered directly to their mobile devices. As with reducing dependence on IT, this applies to almost all platforms within an organization's overall data warehousing environment.
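
    As an illustration of the parameter-driven templates mentioned above, here is a minimal sketch in Python using the standard sqlite3 module. The table name (sales), its columns, the template name (SALES_BY_REGION), and the helper functions are all hypothetical and chosen only for this example; an in-memory database stands in for a special-purpose data mart. The idea is simply that the query text stays fixed while a less-skilled user supplies only the parameter values.

```python
import sqlite3

# Hypothetical report template: the SQL is fixed, only the named
# parameters (:start_date, :end_date) change from request to request.
SALES_BY_REGION = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE sale_date BETWEEN :start_date AND :end_date
    GROUP BY region
    ORDER BY region
"""


def run_template(conn, template, **params):
    """Run a parameter-driven report and return its rows."""
    return conn.execute(template, params).fetchall()


def demo_connection():
    """Build a small in-memory table standing in for a data mart (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("East", "2010-05-03", 100.0),
            ("East", "2010-05-20", 50.0),
            ("West", "2010-05-11", 75.0),
            ("West", "2010-06-02", 40.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    # The user picks a date range; the template does the rest.
    for region, total in run_template(
        conn, SALES_BY_REGION, start_date="2010-05-01", end_date="2010-05-31"
    ):
        print(region, total)
```

    Generalizing a one-off analysis in this way amounts to replacing its hard-coded values with named parameters, so the same template can serve other date ranges or, with further parameters, other situations.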

    The Final Word

    An agile data warehousing environment requires an overall architecture that supports the integrity of enterprise data warehouses while still providing the flexibility to quickly react to special requests. To accomplish this, an organization should establish a flexible data warehousing architecture that may include both in-house and on-demand capabilities. It should also consider relaxing the change management process and deploying software tools that, though perhaps not appropriate for enterprise-wide applications, are well suited to special-purpose analysis needs. Strong search and delivery capabilities will further serve to make the data warehousing environment more agile.

    Michael A. Schiff is a principal consultant for MAS Strategies. He can be reached at [email protected]

Vendor Q&A

  • IBM

    Q. What major trend should BI professionals pay attention to that you believe is being ignored today?

    A. The need for speed, or fast time-to-value, is driving a growing interest in full-stack vendor solutions. With rapid globalization and the velocity of business decisions needed to keep pace, IT no longer has the luxury to purchase multi-vendor products in an attempt to get the perceived best-of-breed piece/part. Installing, configuring, and testing a variety of hardware and software products so they work together efficiently requires a significant investment in skilled resources across all products, and a timeframe that is becoming less acceptable. This is simply too slow, too resource intensive, too costly, and too risky.

    The solution is workload-optimized analytics and business intelligence systems with software, hardware, and services that are pre-configured, pre-tested, and pre-optimized for fast time-to-value on analytical workloads. Full-stack vendors can perform common tasks across all implementations, such as determining what hardware and software solutions will work best together, matching versions and releases, doing the installations, pre-configuring, and initial testing, so that end customers do not have to. Customers still need to customize their solutions to fit their own unique data and content peculiarities, legacy infrastructure, and business objectives.

    Workload-optimized systems take many of the mundane chores away from end customers, leaving them free to focus on solving their most pressing business issues. Faster implementations, faster time-to-value, and systems that are optimized for their unique workloads mean that business users can start analyzing trusted information more quickly, make smarter and more confident decisions faster, and move their business forward.

    Answer supplied by Larry Heathcote, WW GTM Program Director, IBM

About TDWI Experts

TDWI Experts is a twice-monthly e-newsletter where BI/DW thought leaders share opinions and commentary about relevant industry topics and the latest technologies.