The Premier Website for Data Warehousing and Business Intelligence

Current Monograph

Who Ensures Clean, Consistent Data?
(Hint: It’s Not Just the IT Department!)

September 2009

In 2002, TDWI estimated that inaccurate customer data costs U.S. businesses a staggering $611 billion a year in postage, printing, and staff overhead. Frighteningly, the real cost of bad data is higher. Data problems can alienate customers, create revenue and cost leaks, undermine process efficiency, delay expensive projects, and expose an organization to compliance risks. In short, bad data can make it hard for the business to achieve its financial and strategic goals.

Despite the risks, many business executives don’t understand the high costs of bad data. If they are aware of the problems, they don’t know what steps to take to resolve them. IT departments feel helpless to resolve the issue without strong executive sponsorship and funding. Compounding the problem, there is a communications barrier between the two groups.

The reality is that the business runs on data; it’s like fuel for the corporate engine. Without good data, a company can’t possibly understand its customers, suppliers, or competitors—or its own people, processes, and performance. Therefore, it’s imperative that business and IT find common ground and work together to ensure high-quality data.

Download now


Enterprise Information Management:
In Support of Operational, Analytic, and Governance Initiatives

March 2009

In most organizations today, data and other information are managed in isolated silos by independent teams using diverse information management tools for data integration, quality, profiling, federation, meta- and master data management, and so on. However, there’s a trend toward enterprise information management (EIM), a practice that holistically coordinates teams and integrates tools. Through team collaboration and tool interoperability, EIM seeks to improve the “four Cs,” namely the completeness, cleanliness, consistency, and currency of structured and unstructured data.

The four Cs are worthy goals from a technology viewpoint, and they prepare data for the next step, which is to share and leverage information across multiple business units of an organization and with business or trading partners. But getting there demands a number of adjustments to current technology solutions and business processes.

Download now


Closing the Loop in a Consolidated World:
Evaluating Packaged Analytic Applications

January 2009

Analytic applications are the business complement to enterprise applications, although until recently they took a backseat to their operational brethren. After years of neglect, analytic applications are finally getting the attention they deserve. Numerous CIO studies in the past three years show business intelligence—another moniker for analytic applications—at the top of CIOs’ purchasing lists. The reason is pretty straightforward: companies have amassed lots of information about their business and processes (thanks to packaged enterprise applications), but have few tools to mine that data for insights that can lead to smarter decisions, better plans, and more empowered workers. Whereas operational applications provide information that helps a company run more efficiently, analytic applications provide insight that helps a company run more effectively.

Some companies have created prebuilt analytic applications that span most departments in an enterprise. These analytic packages come with an integrated set of tools, data schemas, business views, and predefined reports and dashboards that significantly reduce the time it takes to get a BI solution up and running. Many organizations are now evaluating packaged analytic applications to determine whether they are a more cost-effective alternative to developing BI solutions from scratch.

Download now


Beyond Reporting: Requirements for Large-Scale Analytics

October 2008

To increase the value of their data investments, many organizations are eager to move beyond the delivery of basic reporting capabilities. They want to empower business analysts—who sit at the intersection of data, process, and math—to create data-driven applications that deliver bottom-line insights from massive volumes of data.

There are many challenges to becoming an analytics-driven company. Organizationally, companies must overcome executive perceptions that analytics is too complex, expensive, and abstruse to offer sustainable value. They also need to challenge the traditional approach of running analytic applications on a specialized analytics server or having analysts explore data sets by downloading them to a local server or desktop application.

To apply analytics against large volumes of data, organizations need a purpose-built analytics platform that runs on a massively parallel processing environment and supports custom-built analytical programs that can be invoked via SQL. In addition, the analytic platform should enable IT departments to create sandboxes in the corporate database that give analysts free rein to add their own data and run high-performance analytics without impeding the performance of other users on the system.

Download now


Bridging the Divide: Aligning Analytical Modelers and IT Administrators

July 2008

Simmering below the placid workplace environment in companies pursuing data-driven strategies is an internecine feud. Business analysts who use statistics and sophisticated machine learning techniques to coax hidden patterns and relationships out of large data sets to solve business problems are engaged in a never-ending battle with IT professionals charged with safeguarding corporate data warehouses and ensuring reliable operations of operational applications and systems.

Each side has dug in for a long siege, and neither has contemplated waving the white flag of surrender. Only a few have contemplated a truce, and fewer still have entered into peace talks to end the data wars. The irony is that both sides have much in common. With the right leadership, training, and technology, these antagonists can find common ground and resolve their differences. Each side can get what it needs, if not what it wants, providing their organizations with heightened productivity and bottom-line benefits.

Download now


The Four Imperatives of Data Governance Maturity

July 2008

Anytime data crosses an organizational boundary, it should be governed, whether you’re sharing data among business units internally or publishing data to customers, partners, auditors, and regulatory bodies externally.

TDWI defines data governance, in part, as:

  • An executive-level data governance board, committee, or other organizational structure that creates and enforces policies and procedures for the business use and technical management of data across the entire organization.

The many goals and tasks associated with data governance boil down to four imperatives—two organizational and two technical:

  • Maintain a cross-functional team and process (organizational)
  • Align with data-intense business initiatives (organizational)
  • Govern data usage via technical implementations (technical)
  • Automate data governance process via technical implementations (technical)

Download now


Second-Generation Collaborative Data Integration: Sustainable, Operational, and Governable

May 2008

This TDWI Monograph looks into the new set of trends that are driving the second generation of collaborative data integration:

The system consolidations typical of a green data center require data integration. Recent climate changes and the rising price of electricity have led many people to revisit the sustainability of data centers. In response, many corporations are reducing power consumption and the physical footprint of data centers by consolidating and virtualizing redundant data and hardware servers. Most consolidations require that data be transformed and cleansed to better fit the target, which demands tools and techniques for data integration.

Operational data integration is inherently collaborative, and it's growing as a practice. Operational data integration aggressively consolidates, migrates, collocates, and upgrades systems, usually in the context of a green data center program, ERP initiative, IT centralization, or merger. You can't kill and replace systems successfully without detailed collaboration with the business units that use those systems and the business managers who own and fund them.

Data integration's collaborations are increasingly structured by data governance. Business people now need direct oversight into data integration, due to new requirements for compliance and governance. And data integration is one of the first data management practices that a new data governance board controls, because data integration work effects changes that reach across multiple business units and programs. For these and other reasons, the collaborative side of data integration is progressively enabled and controlled by cross-functional data governance boards.

Download now


The Unique Requirements of Product Data Quality

March 2008

The majority of data quality software implementations have long been focused on customer data—more so than on other data domains like product data, financial data, asset data, location data, and so on. The focus on customer data helps explain why most data quality software techniques—the most common being data standardization, verification, and matching—were originally designed for customer data, whether built in-house by IT or built into a software vendor’s tool. That’s great for customer data, but not so good for other data domains. Customer-oriented data quality techniques and tools can be retrofitted to operate on other data domains, but with limited success. There’s a need to redesign standard data quality techniques—and design new ones—that address the unique requirements of non-customer data domains, especially product data.

Download now


Complex Data: A New Challenge for Data Integration

November 2007

Data integration solutions have grown significantly in number and depth this decade. Today, data integration solutions are commonplace in corporate and governmental organizations, where they continually acquire, merge, and transport data. Integrated data is loaded into target databases and applications inside the organization or packaged into files and documents for exchange with other organizations. Growth has expanded both analytic data integration (which is largely about extract, transform, and load [ETL] for data warehousing) and operational data integration (which mostly involves the consolidation, migration, or synchronization of operational databases).

But there’s a catch. Despite the dissemination of data integration practices in recent years, data integration solutions continue to follow older design paradigms that ignore or short-cut key issues. As data integration practices and technologies expand to embrace complex data, data integration solutions must grapple with two tasks that are new to most data integration specialists: integrating data from complex and nontraditional sources and assuring the quality of data drawn from those sources. This Monograph explores the combination of these two tasks.

Download now


Unifying the Practices of Data Profiling, Integration, and Quality (dPIQ)

October 2007

Data profiling, data integration, and data quality go together like bread, peanut butter, and jam, because all three address related issues in data assessment, acquisition, and improvement. Because they overlap and complement each other, the three are increasingly practiced in tandem, often by the same team within the same data-driven initiative. Hence, there are good reasons and ample precedent for bringing the three related practices together. The result is an integrated practice for data profiling, integration, and quality, succinctly named by the acronym dPIQ (pronounced “DEE pick”).

Download now


Collaborative Data Integration: Coordinating Efforts within Teams and Beyond

July 2007

Collaboration requirements for data integration projects have intensified greatly in this decade. On the technology side, data integration specialists are growing in number, data integration work is increasingly dispersed geographically, and data integration is more tightly coordinated with other data management practices. On the business side, business people have long taken an interest in data integration related to business intelligence and mergers, but they now need direct involvement due to new requirements for compliance and governance. This report provides insight into data integration collaboration, who it supports, best practices, and the collaboration management tools available.

Download now


Beyond the Basics: Accelerating BI Maturity

April 2007

This paper introduces the TDWI Maturity Model, but focuses on the last two stages of development—the Adult and Sage stages. Although less than 20 percent of organizations claim to have entered these two stages according to TDWI Research, all BI practitioners will benefit from examining the characteristics of mature BI implementations. Knowing what’s possible with BI gives organizations a goal to aim for and the motivation to overcome the challenges and pitfalls that plague every BI implementation.

Download now


A Business Approach to Right-Time Decision Making

Click here to download the monograph

June 2006

Being able to respond quickly to business events can spell the difference between success and failure today. But accessing the right information is difficult for users when it's locked away in multiple systems that are accessed via legacy or ERP reporting tools. Although data warehouses can pull together large volumes of historical data, they usually cannot deliver just-in-time information.

Consequently, business users face the choice of either learning several different tools to access critical information, or relying on professional developers to create sometimes outdated reports that offer minimal reuse and don’t anticipate follow-up questions. Users often make decisions from incomplete information or even none at all—choosing to rely solely on gut instinct.

To unlock the information in operational systems and provide users with a powerful blend of historical and current information, organizations should consider using BI products that incorporate EII technology. This technology can query data across multiple systems—including data warehouses, operational systems, Web services, and external data sources—in real time and deliver it to a report or performance dashboard for display.


Embedded Analytics: Closing the Loop Between Operational and Analytical Applications

May 2006

Click here to download the monograph

For business intelligence (BI) to reach its true potential within organizations and become pervasive, BI tools must be easier to use and provide insights into business events as they happen. The best way to simplify and operationalize BI is to embed it directly into the operational applications and processes that drive the business. This is the definition of embedded analytics, and it’s the next wave in BI.

Of course, embedded analytics are nothing new. Organizations have embedded BI functionality into applications and business processes for years. However, most embedded analytics to date have barely scratched the surface of what is possible. Today, visionary vendors and BI professionals are conjuring new ways to blur the lines between analytical and operational applications. They trumpet the benefits of composite applications, process-driven BI, business activity monitoring, BI services, operational dashboards, software-as-a-service models, and open source BI, among other things. With new development techniques that make embedding analytics into business processes and applications as easy as dragging and dropping objects onto a workbench, the future course of BI could change radically.


Best Practices in Data Migration

April 2006

Click here to download the monograph

Countless executives complain that they’re drowning in data, frustrated because they can’t realize the value of this resource. But what feels like drowning is more like buckets of rain falling from every isolated application, database, portal, and spreadsheet. And the buckets won’t fit in the same rain barrel, because each has a unique data model, age, relevance, and quality. Trying to manage a business efficiently or make an enlightened decision based on dozens of data sources is like trying to bathe in a dozen buckets of varying depths and cleanliness.

For many user organizations, this decade has been about addressing problems of this sort while chanting “do more with less.” On the one hand, this mantra tells us to wring more value from preexisting IT systems before considering new ones. On the other hand, it leads us to lessen the number of IT systems, whether to reduce costs or to focus information for greater visibility. Regardless of which direction the “do more with less” mantra leads, it usually manifests itself in IT projects for the migration, consolidation, upgrade, and integration of databases and applications.


The Keys to Enterprise Business Intelligence:
Critical Success Factors

June 2005

Click here to download the monograph

When end-user reporting and analysis tools made their debut in the early 1990s on Windows desktops, many experts believed the tools would liberate end users from their dependency on the IT department to create and deliver custom reports. It turned out that a majority of users found early versions of BI tools too difficult to use, and continued to rely on IT.

The advent of the Web as a BI delivery vehicle has made it easier for organizations to deploy business intelligence on an enterprise scale. The Web centralizes administration, removes the need to install software on users’ desktops, and provides an intuitive user interface that reduces user training and support costs. However, simply migrating BI to the Web is not enough to deliver enterprise business intelligence so that all employees can easily access, analyze, and act on relevant and timely information.


Data Profiling:
Minimizing Risk in Data Management Projects

November 2003

Click here to download the monograph

“Be prepared.” It’s more than just the Boy Scouts’ motto; it’s a rule we use to guide our personal and professional lives. For example, most of us check the weather forecast in the morning to know how to dress for the day. Doctors and pharmacists analyze patient profiles before prescribing new medicines. Skilled entrepreneurs carefully evaluate the market before launching new products or services.

Oddly, it seems we leave this universal wisdom at the door when it comes to understanding data prior to integrating it with other sources of information. Often, we get halfway through a data integration project only to discover that our source data isn’t what we thought it was! Character fields contain numbers; the gender field has five distinct values; invoices reference non-existent customers; sales orders have negative values; and so on.

We scratch our heads, pull in a subject matter expert, and apply new rules to fix this “bad” data. We run the process again and the same thing happens—we discover new data defects and again have to stop and rewrite our integration code. As schedules balloon, costs escalate, and tensions rise, we realize that we’re stuck in an endless loop caused by undiscovered errors that pervade our data sources like mealybugs in a grain tower. One data warehousing manager has described this pernicious cycle as “code, load, and explode.”
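
The kinds of defects listed above (numbers in character fields, too many distinct gender values, invoices that reference non-existent customers, negative order amounts) can be surfaced before integration begins. The following is only a minimal sketch in Python, not a method taken from the monograph: the sample rows, field names, and helper functions are all hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical sample data; table and field names are assumptions.
customers = [
    {"id": 1, "name": "Ada Lovelace", "gender": "F"},
    {"id": 2, "name": "12345", "gender": "female"},
    {"id": 3, "name": "Alan Turing", "gender": "M"},
    {"id": 4, "name": "Grace Hopper", "gender": "f"},
    {"id": 5, "name": "Edgar Codd", "gender": "U"},
]
invoices = [
    {"invoice_id": 100, "customer_id": 1, "amount": 250.0},
    {"invoice_id": 101, "customer_id": 9, "amount": 80.0},
    {"invoice_id": 102, "customer_id": 3, "amount": -40.0},
]


def numeric_text_values(rows, field):
    """Return values of a character field that consist only of digits."""
    return [r[field] for r in rows if str(r[field]).strip().isdigit()]


def value_frequencies(rows, field):
    """Count how often each distinct value of a field occurs."""
    return Counter(r[field] for r in rows)


def orphaned_rows(child_rows, foreign_key, parent_rows, primary_key):
    """Return child rows whose foreign key matches no parent row."""
    parent_keys = {r[primary_key] for r in parent_rows}
    return [r for r in child_rows if r[foreign_key] not in parent_keys]


def negative_values(rows, field):
    """Return rows where a numeric field is below zero."""
    return [r for r in rows if r[field] < 0]


# Collect the findings in one profile so they can be reviewed together.
profile = {
    "numeric_names": numeric_text_values(customers, "name"),
    "gender_values": value_frequencies(customers, "gender"),
    "orphaned_invoices": orphaned_rows(invoices, "customer_id", customers, "id"),
    "negative_amounts": negative_values(invoices, "amount"),
}

for check, finding in profile.items():
    print(check, "->", finding)
```

Running checks like these against source data up front gives the team a list of defects to resolve with subject matter experts before any integration code is written, rather than discovering them one load at a time.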