Exclusive to TDWI Members | View online: tdwi.org/flashpoint

TDWI FlashPoint Newsletter

Feature

January 13, 2011

 

ANNOUNCEMENTS

New Best Practices Report: Visual Reporting and Analysis


CONTENTS

Feature
In Store for 2011: Jill’s New Year’s Prognostications



Feature
Altering Database Design for Software Products



Feature
It’s Time to Consider Unstructured Data



TDWI Research Snapshot
Where Spreadmarts Lurk



FlashPoint Rx
Mistake: Failing to Define Data Governance



TDWI Bulletin Board
See what's current in TDWI Education, Research, Webinars, and Marketplace


In Store for 2011: Jill’s New Year’s Prognostications

Jill Dyche
Baseline Consulting

Topic: Business Intelligence

When people ask how my year went, I say “Fine!” and then change the subject.

What I’m really thinking is that 2010 was a year of diminished expectations and compromised goals, in which people talked more than they listened. The words “maybe later” became entrenched in corporate vocabularies, and promising opportunities wound up in the gaping maw that is the scrap heap of progress.

Needless to say, I’m more optimistic about 2011. Before offering my prognostications for the coming year, and in the spirit of accountability--a word absent from the C-suite in 2010--I’d like to recall some of my predictions from last year: “Management adopts the phrase ‘data as an asset.’” Check. “IT gets serious about the cloud.” Check. “Decommissioned legacy systems.” Check. Mobile dashboards, local data governance, and social media? Check, check, check.

These items were predictable in their predictability, and many have since appeared on prediction lists for 2011. To avoid trotting out a new crop of soon-hackneyed prognostications, this year I’ve decided to make a single, yet far-reaching prediction that applies to almost every company I know:

Companies will start addressing the entrenched cultural issues that inhibit delivery and sabotage growth.

Why am I so confident in this single prediction, to the point of forsaking shiny buzzwords--some of them my own--in favor of a loftier, business-focused forecast? It’s because of what my clients have said recently. The following are quotes from executives, business users, and IT staff on some of our 2010 engagements:

  • “I am spending more time getting data off of various systems than actually using the data. You’ve heard that from everyone you’ve talked to here, but what you haven’t heard is this: I spend more time gathering data than I do on anything else. It’s my real job, and if my management knew that, I’d get fired.” --Customer service manager at a manufacturer
  • “We can’t tie the decisions we make to business outcomes. It’s all so much authority-in-a-vacuum. We know what we’re doing, we just don’t know how we’re doing.” --Chief financial officer at a regional bank
  • “We’re not a fact-based organization, and God knows we don’t need another tool. We need a systemic transformation.” --Segment manager at a specialty retailer
  • “My vendors have let me down. They’ve ceded to how we do things around here, which is project by project. I don’t have a data strategy, and I don’t have a long-term road map. My BI program is nothing more than a set of siloed projects, just like everything else.” --Chief information officer at a global high-tech company
  • “I’ve heard you speak at conferences about not launching data governance too early or it will become a dirty word. You know what our ‘dirty word’ is? Dashboard. If you say it, you have to put a dollar in a jar.” --IT project manager at a pharmaceutical company
  • “IT is waiting for [a large consulting firm] to tell us what to do. [The consulting firm] wants to build an MDM solution from scratch, and IT will probably say yes. Why? Because it means that someone will be delivering something. The fact that it won’t be us doesn’t seem to bother anyone.” --Chief marketing officer at a catalog retailer

My personal favorite:

  • “You don’t understand. I have built my own database to protect myself from other people’s databases.” --Doctor at a healthcare provider

I consider these quotes classic examples of the need for better BI and sustained data integration. I also know these aren’t issues of inferior technology or even staff aptitude, but of companies not being able to get out of their own way.

The problem is that everyone is expecting transformation from the top down. If your executive leadership has any tenure at all, though, don’t hold your breath. Any type of change has to start from the middle. Executive management isn’t intimate enough with the impacts of skewed cultural norms to change them, and on-the-ground staff doesn’t have the organizational authority to drive change in a sustained way.

This means the change agent is probably you--and maybe a handful of people you work with who get it. You begin the transformation by delivering a small, controlled project in a way that shows the tactical benefits of change. You craft guiding principles, assign decision rights, quietly engage a handful of stakeholders, and design your own development process that adheres to best practices as closely as you can manage. Then you deliver.

When we see each other this year and I ask how 2011 is treating you so far, you’ll say “Fine,” and you’ll mean it.

Jill Dyche is a partner and cofounder of Baseline Consulting, a technology and management consulting firm. She is co-chair of TDWI’s Master Data, Quality, and Governance Solution Summit, and will be a keynote speaker at the TDWI World Conference Las Vegas in February.

Feature

Altering Database Design for Software Products

Chris Adamson
Oakton Software LLC

Topic: Database Design

When designing your data architecture, it is essential to consider your technical architecture. This includes evaluating and responding to the capabilities of your software tools. Your business intelligence software and database management system can and should influence data mart design.

Unintended Consequences
Your business intelligence (BI) software is the primary mechanism for accessing information from the data warehouse. End users work with your BI software to perform ad hoc analysis, and expert developers use it to build complex reports and dashboards. The BI software controls how these people interact with the database. It influences the queries they can formulate and therefore determines the kinds of analysis they can perform.

When data architects don’t consider their BI software, the consequences can be crippling. Important metrics are unintentionally locked in the database, inaccessible to end users. The data mart becomes a black hole. Data cannot escape to be transformed into information. Although the BI software may be blamed, the true culprit is the database design.

Lost Potential: Compound Metrics
A common symptom of this failure is that users cannot perform some of the most powerful kinds of analysis. The clearest indicators of the health of a business are often compound metrics. These business metrics combine information about more than one business process. It is not enough for database designers to capture the components of these metrics; the design must also take into account how the BI software will combine them.

For example, a wholesaler takes orders from customers and fulfills them. Separate star schemas or cubes are constructed to analyze each process. Comparison of the two processes produces a third class of business metrics. Ratios of orders to shipments may factor into measuring and evaluating fulfillment rates, customer satisfaction, and inventory efficiency.

It is important to consider whether the BI software can support these compound metrics. Simply joining the stars will produce inaccurate results. (If a product was ordered once and appeared on three shipments, the order will be triple-counted.) Special processing is required to balance the relative cardinality of each data set. Many BI tools can detect that this is required and automatically apply special processing. Some tools may not be able to automate this processing in an ad hoc environment.
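
The triple-counting problem is easy to reproduce. The following is a minimal sketch using Python's built-in sqlite3 module; the table and column names (order_facts, shipment_facts, quantity_ordered, quantity_shipped) and the sample rows are hypothetical, chosen only to mirror the wholesaler example. It contrasts a direct join of the two fact tables with one common form of the "special processing": summarize each process separately at a shared grain, then merge the summaries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_facts (product TEXT, quantity_ordered INTEGER);
    CREATE TABLE shipment_facts (product TEXT, quantity_shipped INTEGER);
    -- Hypothetical data: one order for 30 widgets, filled by three shipments of 10.
    INSERT INTO order_facts VALUES ('widget', 30);
    INSERT INTO shipment_facts VALUES ('widget', 10), ('widget', 10), ('widget', 10);
""")

# Naive approach: join the fact tables directly. The single order row is
# repeated once per matching shipment row, so its quantity is summed three times.
naive = conn.execute("""
    SELECT o.product, SUM(o.quantity_ordered), SUM(s.quantity_shipped)
    FROM order_facts o
    JOIN shipment_facts s ON s.product = o.product
    GROUP BY o.product
""").fetchall()

# Balanced approach: aggregate each fact table to the same grain (product)
# first, then join the two one-row-per-product summaries.
balanced = conn.execute("""
    SELECT o.product, o.ordered, s.shipped
    FROM (SELECT product, SUM(quantity_ordered) AS ordered
          FROM order_facts GROUP BY product) o
    JOIN (SELECT product, SUM(quantity_shipped) AS shipped
          FROM shipment_facts GROUP BY product) s
      ON s.product = o.product
""").fetchall()

print(naive)     # [('widget', 90, 30)]  -- order quantity triple-counted
print(balanced)  # [('widget', 30, 30)]
```

The sketch uses an inner join between the two summaries for brevity. A real comparison would more likely need an outer join so that products ordered but not yet shipped still appear in the result.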

The schema designer must anticipate the behavior of the BI software and adapt the design accordingly. For tools that can automate comparisons, you may have to follow specific guidelines (such as avoiding degenerate dimensions). For tools that cannot, you will need to perform the comparison during ETL processing and provide an additional star (or cube) for comparison.
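
For a tool that cannot automate the comparison, the ETL-side alternative might look something like the sketch below. It is an illustration only: the function name build_comparison_rows, the field names, and the sample rows are all hypothetical, and a real ETL job would read from and write to the warehouse rather than work on in-memory lists. The idea is simply to bring both processes to a common grain before loading, so the BI tool queries a single pre-merged star.

```python
from collections import defaultdict


def build_comparison_rows(order_rows, shipment_rows):
    """Merge order and shipment facts into one row per product.

    Each input is an iterable of (product, quantity) pairs. Only the additive
    components are stored; a ratio such as shipped/ordered is left to be
    computed at query time from the summed components.
    """
    totals = defaultdict(lambda: {"quantity_ordered": 0, "quantity_shipped": 0})
    for product, quantity in order_rows:
        totals[product]["quantity_ordered"] += quantity
    for product, quantity in shipment_rows:
        totals[product]["quantity_shipped"] += quantity
    # Sort by product so the output order is predictable.
    return [{"product": product, **measures}
            for product, measures in sorted(totals.items())]


# Hypothetical input: widgets ordered once and shipped three times,
# gadgets ordered but not yet shipped.
comparison_rows = build_comparison_rows(
    order_rows=[("widget", 30), ("gadget", 5)],
    shipment_rows=[("widget", 10), ("widget", 10), ("widget", 10)],
)
for row in comparison_rows:
    print(row)
```

Because every product that appears in either input gets a row, unshipped orders show up with a shipped quantity of zero instead of disappearing from the comparison.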

Failure to adapt the design may leave these important business metrics inaccessible.

Other Impacts: Performance
Business intelligence software is not the only tool that influences dimensional design. Your database management system (DBMS) may also have an impact on the design. Design options that seem convenient for reporting purposes sometimes harm the performance of the solution. Data architects must carefully weigh these trade-offs.

For example, consider a customer service operation that tracks the outcome of calls to a technical support center. This may require a “factless” design, in which no explicit metrics are recorded. Instead, queries will count calls, grouping the results by, for example, product or support representative.

For convenience, designers like to include a special metric. In this case, it might be called “call quantity” and always be populated with the value “1.” This allows users to request “call quantity” rather than construct counts. This appears useful, but it may negatively impact performance. Instead of counting rows, the DBMS must read them to produce a sum. In this case, the impact may be minimal, but in others it can be severe.
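
To make the two query styles concrete, here is a small sketch using Python's built-in sqlite3 module. The table name support_call_facts, its columns, and the sample rows are hypothetical. It shows only that the two queries return the same answer; it does not measure performance, which depends on the DBMS and how it stores and reads rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical factless fact table with the convenience column added.
    CREATE TABLE support_call_facts (
        product TEXT,
        representative TEXT,
        call_quantity INTEGER DEFAULT 1  -- always populated with 1
    );
    INSERT INTO support_call_facts (product, representative) VALUES
        ('router', 'Ana'), ('router', 'Ben'), ('modem', 'Ana');
""")

# Style 1: count the rows. No column values are needed to answer the question.
calls_by_count = conn.execute("""
    SELECT product, COUNT(*) FROM support_call_facts
    GROUP BY product ORDER BY product
""").fetchall()

# Style 2: sum the convenience metric. The DBMS has to read call_quantity.
calls_by_sum = conn.execute("""
    SELECT product, SUM(call_quantity) FROM support_call_facts
    GROUP BY product ORDER BY product
""").fetchall()

print(calls_by_count)  # [('modem', 1), ('router', 2)]
print(calls_by_sum)    # [('modem', 1), ('router', 2)]
```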

Star vs. Snowflake
Students of dimensional design are usually taught not to model relationships among attributes in a dimension table. This guideline is useful, because the alternative “snowflake” design offers no analytic benefit per se. It’s important to drive this point home so that novice designers don’t overly complicate their solutions.

Some BI tools, however, cannot deliver key functionality without a snowflake design. Support of aggregate navigation, conformed dimensions, or process comparison may be dependent on a snowflake design. The DBMS performance may also be a factor. Most perform best with a star schema design, but some are optimized for the snowflake.

In these cases, the capabilities of the software tools warrant consideration of a snowflake design. Dogmatic avoidance of the snowflake will severely hamper the capability and performance of the final solution.
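
For readers who have not seen the two shapes side by side, the sketch below builds the same product dimension both ways using Python's built-in sqlite3 module. All table names, column names, and rows are hypothetical. It illustrates the point that the snowflake adds joins without adding analytic capability: both queries return identical results. Which shape a particular BI tool or DBMS handles better is a separate question the sketch does not address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: brand and category are plain attributes of the product dimension.
    CREATE TABLE product_dim_star (
        product_key INTEGER PRIMARY KEY, product_name TEXT,
        brand_name TEXT, category_name TEXT);

    -- Snowflake: the same attributes normalized into separate tables.
    CREATE TABLE category_dim (
        category_key INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE brand_dim (
        brand_key INTEGER PRIMARY KEY, brand_name TEXT,
        category_key INTEGER REFERENCES category_dim(category_key));
    CREATE TABLE product_dim_snow (
        product_key INTEGER PRIMARY KEY, product_name TEXT,
        brand_key INTEGER REFERENCES brand_dim(brand_key));

    CREATE TABLE sales_facts (product_key INTEGER, sales_amount INTEGER);

    INSERT INTO product_dim_star VALUES
        (1, 'Cola 12oz', 'FizzCo', 'Beverages'),
        (2, 'Chips', 'Crunchy', 'Snacks');
    INSERT INTO category_dim VALUES (10, 'Beverages'), (20, 'Snacks');
    INSERT INTO brand_dim VALUES (100, 'FizzCo', 10), (200, 'Crunchy', 20);
    INSERT INTO product_dim_snow VALUES (1, 'Cola 12oz', 100), (2, 'Chips', 200);
    INSERT INTO sales_facts VALUES (1, 5), (1, 7), (2, 3);
""")

# Star: one join from the fact table to the dimension.
star_result = conn.execute("""
    SELECT d.category_name, SUM(f.sales_amount)
    FROM sales_facts f
    JOIN product_dim_star d ON d.product_key = f.product_key
    GROUP BY d.category_name ORDER BY d.category_name
""").fetchall()

# Snowflake: three joins to reach the same category attribute.
snowflake_result = conn.execute("""
    SELECT c.category_name, SUM(f.sales_amount)
    FROM sales_facts f
    JOIN product_dim_snow p ON p.product_key = f.product_key
    JOIN brand_dim b ON b.brand_key = p.brand_key
    JOIN category_dim c ON c.category_key = b.category_key
    GROUP BY c.category_name ORDER BY c.category_name
""").fetchall()

print(star_result)       # [('Beverages', 12), ('Snacks', 3)]
print(snowflake_result)  # [('Beverages', 12), ('Snacks', 3)]
```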

Balanced Decisions
Good design decisions strike a careful balance of business capability, ease of use, and performance. To make the right choices, you must consider the capabilities of your software products.

Chris Adamson is a data warehousing consultant, educator, and author. For more on this topic, see his latest book, Star Schema: The Complete Reference (McGraw-Hill, 2010).

Feature

It’s Time to Consider Unstructured Data

Patty Haines
Chimney Rock Information Solutions

Topic: Unstructured Data

For many years, data warehouses and data marts were populated only with structured data--that is, data from tables and files stored with defined lengths, data types, and constraints. Unstructured data was ignored because processing it, and accurately identifying which of it was critical, was too difficult and costly. However, documents containing unstructured data can provide valuable insight not found in operational data structures. There is a wealth of business value in these untapped documents.

Unstructured data is spread throughout organizations within Word documents, PDFs, spreadsheets, e-mail messages, notes, and operational system text fields. It can be found in:

  • Legal documents: contracts, statements of work, litigation notes, property titles
  • Customer comments: from call centers, social media, surveys, chat rooms
  • Notes: from physicians’ and scientists’ research, journals, property inspections
  • Messages: e-mail as part of staff touchpoint programs, operational processes
  • Operational documents: data created outside operational systems that provides additional detail not stored in operational data structures
  • Text fields: data elements in operational data structures used to store free-form text that further documents a piece of information

Off-the-shelf software is now available for managing unstructured data. This software extracts, organizes, and loads unstructured data into new data structures, which can then be used as standalone data or merged with structured data already stored in a data warehouse. Companies that have explored their unstructured data are seeing business value and strong ROI from mining it. In 2010, some applicants for the TDWI Best Practice Awards were working with documents containing unstructured data, analyzing critical information they were unable to identify through other methods.

Common business areas for mining unstructured data are:

Customer comments. Organizations can scrub and mine chat room and social media data to find customer opinions about the value and quality of their products and services and suggestions about improvements the organization can make. By analyzing this data in a timely manner, a company can make changes to its products, creating an immediate impact on its customers.

Comment notes. Companies can analyze comment cards or notes from customers with complaints or praise for their products to better meet the needs of their customers as quickly as possible.

Physician notes. Research institutions have decades of research, analysis, and clinical trials documented in unstructured data. Organizations can process these notes, combining them into a data structure for in-depth analysis by scientists and physicians. Visualization software has also been used against unstructured data to identify frequently used words and provide insight into previously unrecognized relationships.
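
The simplest form of "identifying frequently used words" can be sketched in a few lines of Python. This is an illustration only, not a description of any commercial product: the function name frequent_terms, the stop-word list, and the sample notes are all made up, and real text-mining software does far more (stemming, phrase detection, entity extraction).

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real tools ship much larger ones.
STOP_WORDS = {"a", "after", "and", "no", "the", "today"}


def frequent_terms(notes, top_n=2):
    """Return the top_n most common non-stop-words across free-text notes."""
    counts = Counter()
    for note in notes:
        # Lowercase the note and keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", note.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up notes, for illustration only.
sample_notes = [
    "Patient reports headache and nausea after the new dosage.",
    "Headache persists; nausea resolved.",
    "No headache today.",
]
top_terms = frequent_terms(sample_notes)
print(top_terms)  # [('headache', 3), ('nausea', 2)]
```

The resulting term counts are structured rows, which is what makes it possible to load them into a table and merge them with data already in the warehouse.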

Contracts. Volumes of contracts stored as PDFs are often manually removed from a folder and read individually to determine legal requirements, one contract at a time. When an organization needs to know the types of contracts it holds, identify contracts that are expiring, or find contracts that have large penalty clauses, contracts and litigation material must be in a format that can be analyzed quickly with automated tools.
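
As a rough illustration of what "analyzed quickly with automated tools" could mean for expiring contracts, the sketch below assumes the contract text has already been extracted from the PDFs and that expiry clauses follow one fixed wording ("expires on YYYY-MM-DD"). Both are simplifying assumptions; real contracts vary widely. The function name, the pattern, and the sample contracts are hypothetical.

```python
import re
from datetime import date

# Assumed clause wording; real contracts would need many more patterns.
EXPIRY_PATTERN = re.compile(r"expires on (\d{4})-(\d{2})-(\d{2})", re.IGNORECASE)


def contracts_expiring_by(contract_texts, cutoff):
    """Return (name, expiry date) pairs for contracts expiring on or before cutoff.

    contract_texts maps a contract name to its already-extracted text.
    Contracts with no recognizable expiry clause are skipped, so they
    would still need manual review.
    """
    expiring = []
    for name, text in contract_texts.items():
        match = EXPIRY_PATTERN.search(text)
        if match is None:
            continue
        year, month, day = (int(part) for part in match.groups())
        expires = date(year, month, day)
        if expires <= cutoff:
            expiring.append((name, expires))
    # Earliest expiry first.
    return sorted(expiring, key=lambda item: item[1])


# Made-up contract snippets, for illustration only.
sample_contracts = {
    "acme_supply": "This agreement expires on 2011-03-31 unless renewed.",
    "globex_lease": "The lease Expires on 2012-06-30.",
    "initech_nda": "Term is perpetual.",
}
expiring_2011 = contracts_expiring_by(sample_contracts, date(2011, 12, 31))
print(expiring_2011)  # [('acme_supply', datetime.date(2011, 3, 31))]
```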

Staff touchpoint programs. Some organizations have large staffs and a high employee turnover rate. Because it is expensive to continually hire and train new staff, many companies have developed programs to improve morale, including touchpoint programs. Such programs reach out to staff members through informal e-mail. Organizations can analyze the impact of touchpoint programs by extracting data from e-mail replies, tracking and merging it with operational system data to ensure positive changes result from these programs.

It is time for data warehouse teams to reevaluate unstructured data as a source for their data warehouses. There is a wealth of information in these unstructured documents--previously ignored or considered unusable--for automated processing and analysis. Technology has caught up with the need to analyze and mine the volumes of unstructured data that exist in our organizations.

Patty Haines is president of Chimney Rock Information Solutions, a company specializing in data warehousing and data quality. She can be reached at 303.697.7740.

TDWI Research Snapshot
Highlight of key findings from TDWI's wide variety of research

Where Spreadmarts Lurk
Spreadmarts are everywhere. They exist in large and small organizations and are used to support every department and almost every imaginable business process. On average, organizations have 837 spreadmarts, although the median number is 30, which means a small number of organizations have a huge number of spreadmarts. Specifically, 17 out of 195 organizations said they have more than 1,000 spreadmarts, four have more than 10,000, and two have 40,000 and 50,000 spreadmarts, respectively.
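
The gap between an average of 837 and a median of 30 is what a few very large values do to a mean. The sketch below uses made-up counts, not the survey data, to show the effect with Python's standard statistics module.

```python
from statistics import mean, median

# Hypothetical spreadmart counts for seven organizations (not survey data).
# Six are modest; one organization reports 10,000.
spreadmart_counts = [5, 10, 20, 30, 40, 60, 10_000]

typical = median(spreadmart_counts)  # middle value, unaffected by the outlier
average = mean(spreadmart_counts)    # pulled far upward by the single outlier

print(typical)         # 30
print(round(average))  # 1452
```

In this made-up sample, removing the one large organization drops the mean to 27.5, which is why the median is the better guide to a typical organization.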

In reality, few companies know exactly how many spreadmarts they have. When we asked respondents how they count the number of spreadmarts in their organization, a whopping 60% said, “We haven’t counted them.” Another 25% said, “We count them as we come across them.” Perhaps the organizations that said they have tens of thousands of spreadmarts are not anomalies, but are more honest than the rest! (See Figure 2.)


Source: Strategies for Managing Spreadmarts: Migrating to a Managed BI Environment (TDWI Best Practices Report, Q1 2008).

FlashPoint Rx
FlashPoint Rx prescribes a "Mistake to Avoid" for business intelligence and data warehousing professionals.

Mistake: Failing to Define Data Governance
By Jill Dyche and Kimberly Nevala

Data governance has become a veritable rubric for all things data. Google the term and you’ll come up with references to data quality, metadata, data warehousing, data ownership, and data security, to name just a few. We define data governance as the organizing framework for aligning strategy, defining objectives, and establishing policies for enterprise information. What is really important is how you define data governance and how your organization understands it. As nascent as it is, data governance has failed in more than one well-meaning company because people misinterpreted its meaning, its value, and what shape it would eventually take in their companies.

The most common definitional mistake companies make is using “data governance” synonymously with “data management.” Data governance is the decision-rights and policy-making for corporate data, while data management is the tactical execution of those policies. Both require executive commitment, and both require investment, but data governance is business-driven by definition, while data management is a diverse and skills-rich IT function that ideally reports to the CIO.

Unlike CRM, which--after an initial failed attempt--would simply be rebranded “The Voice of the Customer” and relaunched, once data governance becomes a dirty word, an organization rarely gets a second chance. “You can’t use the word governance here,” one brokerage company executive confided recently. “We’ll have to call it something else.” Attempts at euphemistic substitutes don’t hide the fact that definitional clarity and a firm vision for data governance do matter.

Source: Ten Mistakes to Avoid When Launching a Data Governance Program (Q1 2008).

TDWI Bulletin Board


EDUCATION & RESEARCH

TDWI World Conference:
Las Vegas, NV

February 13–18, 2011

TDWI BI Executive Summit:
Las Vegas, NV

February 14–16, 2011

TDWI Seminar:
Dallas, TX

March 14–17, 2011


WEBINARS

Visual Reporting and Analysis: Seeing Is Knowing

The Private Cloud: Your Next BI/DW Platform?

Data Governance, Data Architecture, and Metadata Essentials



MARKETPLACE

TDWI White Paper Library
Beyond the Data Warehouse: A Unified Information Store for Data and Content

TDWI White Paper Library
Metadata-Driven ETL Using Expressor Semantic Types

TDWI White Paper Library
The Definitive Guide to BusinessObjects Business Intelligence for SAP



Copyright 2011. TDWI. All rights reserved.