Q&A with the Experts
A business intelligence or data warehouse implementation can be a formidable undertaking. In these pages, leading business intelligence and data warehousing solution providers share their answers to the questions they hear often from industry professionals. Tim Feetham, an independent consultant, provides his analyst viewpoint to each Q&A.
What should I look for when investigating data integration technology from a BI vendor?
Not all extraction, transformation, and load (ETL) tools are created equal. Performance remains key to batch data movement. Enterprise-class data integration platforms should be able to move data in real time, access a wide array of sources, and offer built-in profiling and cleansing capabilities. Once established, a unified data integration and BI platform should support impact analysis from source systems to end-user reports, allow ETL and BI metadata to be shared and audited, and give users of BI tools visibility into data lineage. This will help reduce administration costs, improve BI user adoption, and advance your standardization initiatives.
BI vendors have been expanding their tool suite capabilities over the past several years by strengthening their data management infrastructure technologies. These products have historically been referred to as ETL tools. ETL has most often been associated with batch processing, but the growing demands of data integration initiatives that combine BI analytics and real-time access to data have driven ETL beyond its core functionality. Real-time feeds, operational and development support, auditing, and metadata integration between tools are critical to a modern BI program. Customers will be well served to check out their BI tool suite vendor's offerings in this area.
As organizations implement BI solutions, what considerations would help them to achieve cross-organizational visibility?
The first consideration is how best to leverage your existing data assets and data integration strategies, including enterprise information integration (EII) and extract, transform, and load (ETL) software. Second, choose a BI vendor that offers visibility into other data integration technologies in addition to its own. Third, select a single BI vendor with an open data strategy so you can reap the benefits of standardizing on a BI tool without losing previous data or investments. Achieving visibility into a single version of the truth across the organization contributes to predictability, accountability, and transparency—the pillars of superior corporate performance management.
Organizations that have delivered data marts and data warehouses through the use of a good set of BI tools often find themselves asked to expand the reach of these tools to other data sources. Business users are especially interested in gaining access to data in ERP systems such as SAP, as well as in legacy systems, via enterprise information integration (EII) technologies. They may want to include data from these sources alongside the data warehouse in the same report. Wise BI managers will select BI tool suites that provide comprehensive data access and the functionality, such as scorecarding, to meet future demand.
Conversion Services International
The cost of data quality improvement is growing at an alarming pace, and correcting data quality often means duplicated efforts, divergent technologies, and inconsistent remediation strategies. What are some emerging trends and forward-thinking approaches for addressing these issues?
One trend gaining momentum within corporate inner circles is establishing enterprisewide data quality competency centers. Governed by highly visible, respected business leaders and supported by business and IT professionals, these centers:
- Improve the accuracy and reliability of data by developing and disseminating consistent standards, best-practice methodologies, and supporting technologies
- Mitigate risk and enhance effectiveness of strategic data-driven initiatives
- Govern all processes that monitor and continually improve data quality performance
- Ensure that data quality concepts and value are instilled within the corporate culture
The general success of data warehousing initiatives over the past decade has been tempered by the spotlight that these initiatives have shone on poor data quality. Although the first reaction has often been to “shoot the messenger,” intelligent organizations are realizing that poor data quality is not a data warehousing issue. It is a corporate issue. One forward-thinking healthcare organization addressed this issue by creating an independent data quality office headed up by a chief data quality officer who was an MD. This organization was ultimately able to measure the value of the DQ office through lives saved.
How does data monitoring fit into an existing data quality program?
Regardless of the industry or company size, bad data is everywhere. The obvious solution is a data quality technology that can improve data integrity in a “once-and-done” project. But maintaining good corporate data takes constant vigilance. To make data quality a corporate priority, organizations must institute a data management program that includes a continual, routine control mechanism. Data monitoring takes the same rules from an initial data quality effort and applies them over time to enforce corporate standards. This allows companies to understand trends about data integrity—and flag and resolve problems before they significantly impact operations.
The issue of poor data quality often comes to light through data warehousing initiatives. The tendency to want to “fix” the problem through hard-coded data warehousing load programs offers an incomplete solution and may mask problems in operational systems. However, the cost of modifying those operational systems to improve data quality might be quite high. Data quality monitoring software, which incorporates business rules and can alert the organization to quality issues, offers a solution to this conundrum. Data warehousing teams can deploy this software between the extract and load functions in order to notify the organization of data-quality problems.
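The monitoring approach described above can be sketched in a few lines. This is a minimal illustration, not any vendor's product: the rule names, fields, and alert threshold are all hypothetical, and a real deployment would sit between the extract and load stages of the warehouse pipeline.

```python
# Minimal sketch of rule-based data quality monitoring between extract
# and load: the same rules from the initial cleanup effort are reapplied
# to each batch, and the organization is alerted when violation rates
# exceed a threshold. Rule names, fields, and threshold are hypothetical.

def monitor(records, rules, alert_threshold=0.05):
    """Apply each rule to every record; return the records that passed,
    per-rule violation rates, and the rules that breached the threshold."""
    violations = {name: 0 for name in rules}
    clean = []
    for rec in records:
        failed = [name for name, rule in rules.items() if not rule(rec)]
        for name in failed:
            violations[name] += 1
        if not failed:
            clean.append(rec)
    total = len(records) or 1
    rates = {name: count / total for name, count in violations.items()}
    alerts = [name for name, rate in rates.items() if rate > alert_threshold]
    return clean, rates, alerts

# Hypothetical business rules carried over from the initial DQ project.
rules = {
    "postal_code_present": lambda r: bool(r.get("postal_code")),
    "amount_non_negative": lambda r: r.get("amount", 0) >= 0,
}
batch = [
    {"postal_code": "98101", "amount": 120.0},
    {"postal_code": "", "amount": 75.5},
    {"postal_code": "02139", "amount": -10.0},
]
clean, rates, alerts = monitor(batch, rules)
```

Tracking `rates` over successive loads is what turns a one-time cleanup into the trend analysis the vendor describes.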
How can I optimize a multi-vendor data warehouse platform?
The integration effort required to provide optimal data warehouse performance is often underestimated. Many different technologies require optimization, including server, storage, OS, and RDBMS—at a minimum. These are typically acquired from different vendors, who make no promises and often point fingers when pressured about performance. Tunable parameters exist, but the selection is daunting. Considering all this, the simplest way to achieve reliable data warehouse performance is to push the burden of integration and optimization onto the shoulders of a single vendor who accepts responsibility for ensuring performance of the entire stack.
If you are like many data warehousing professionals today, you started out small and supplemented your data warehousing platform as you went. At some point, the platform became complex enough that optimization became an issue. A class of cost-effective technologies is emerging to address this need. These are technologies that come with integrated hardware, operating systems, and DBMS engines. The best of breed offer high-performance parallel processing, support ANSI-standard SQL, provide interfaces for all of the major BI and ETL tools, and require minimal tuning. They are essentially plug-and-play upgrades to your data warehouse platform.
Dun & Bradstreet Sales & Marketing Solutions
Why is establishing 360-degree views of customers through BI/DW applications so critical?
The ability to see and understand customers from every angle is the crux of ROI from a BI/DW application, and it paves the way for increased corporate revenue. With 360-degree views of every customer, you can more accurately calculate their total profitability, lifetime value, and credit assessment—your risk exposure. Identifying all the contact points in a customer organization also uncovers new opportunities to sell more deeply and widely. The result is more sales opportunities and increased customer acquisition—and greater profitability.
A 360-degree view of a customer is not just about collecting all of the interactions that a customer has with an organization into a single application. It is also about understanding where that customer fits with other customer interactions over time. Operational systems facilitate day-to-day individual interactions with customers, but the data warehouse offers the type of historical and integrated data that will support activities such as identifying a customer within a group of other customers. The benefits of identifying a customer within a group include understanding the risks and service opportunities that a given customer is likely to present.
Group 1 Software, A Pitney Bowes Company
Weekly, my users request reports requiring new data sources that aren’t (yet) in the DW. How can these requests be accommodated without compromising the DW and impacting other users?
This is a common situation. Group 1 OpenLink provides enterprises with a BI tool that enables users to integrate new data sources with data from the DW without compromising the data integrity. Group 1 OpenLink works by allowing any ODBC- or JDBC-compliant software application to leverage the analytical power of Group 1’s Data Flow Server. Business analysts working with BI tools such as Actuate, Brio, Business Objects, Cognos, Crystal Decisions, and Microsoft Excel can now use Group 1 OpenLink to transparently incorporate advanced analytics into existing reports and BPM solutions.
Many of those requests will be for operational data. As organizations realize the benefits of their data warehouse–centric BI platforms, their business users are demanding direct access to systems using those same BI technologies. In response, IT groups are implementing enterprise information integration (EII) platforms. EII provides access to operational data for most common BI platforms. EII also provides data transformations “on the fly” so that information is delivered with names and formats that meet organizational standards. EII does not take the place of the data warehouse when users need historical data, but it can provide integrated data for operational reporting.
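The "on the fly" transformation an EII layer performs can be pictured as a simple field-mapping step applied as rows stream through. The sketch below is illustrative only; the operational field names, standard names, and date format are hypothetical, not part of any particular EII product.

```python
# Minimal sketch of EII-style on-the-fly transformation: operational
# field names are renamed to organizational standards and formats are
# normalized as each record streams through. All names are hypothetical.

FIELD_MAP = {            # operational name -> organizational standard
    "CUST_NO": "customer_id",
    "ORD_DT": "order_date",
    "AMT": "order_amount",
}

def standardize(row):
    """Rename fields and normalize formats for one operational record."""
    out = {FIELD_MAP.get(k, k): v for k, v in row.items()}
    # Normalize a compact date (YYYYMMDD) to ISO format.
    d = out.get("order_date")
    if isinstance(d, str) and len(d) == 8:
        out["order_date"] = f"{d[0:4]}-{d[4:6]}-{d[6:8]}"
    return out

operational_row = {"CUST_NO": "10442", "ORD_DT": "20240315", "AMT": 89.95}
standard_row = standardize(operational_row)
# standard_row now uses organizational names and an ISO-formatted date.
```

Because the mapping runs at query time, business users see standard names and formats without the data ever being staged in the warehouse.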
Is there a better way to manage my data warehouse or data mart while improving my query performance?
Ventana Research recently found that most companies react to the issues of query and report performance. In addition, organizations tend to employ mitigating strategies that provide only minimal improvements. To realize the high performance required by today’s demanding analysis and reporting needs, organizations must take a proactive, structured approach to managing performance. This was not possible until a new category of technology became available: data aggregation solutions. These innovative solutions can help you proactively manage query and reporting performance to meet the ever-changing and growing needs of your company.
Data warehousing initiatives often grow out of IT-centered, enterprise reporting efforts that have "hit the wall" in terms of delivery turnaround and performance. A data warehouse will definitely benefit such an effort. However, data warehousing initiatives usually face heavy demand for ad hoc reporting and analysis from business users. Ad hoc access will add value to the warehousing initiative, but poses a serious challenge for the database designer. Design options include adding summary tables, multi-dimensional cubes, or a technology that provides aggregation, caching, and compression algorithms to the existing technical architecture. Data warehousing teams should check out this last option.
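The summary-table option mentioned above comes down to paying the aggregation cost once, at load time, so repeated queries read a handful of summary rows instead of scanning the detail table. A minimal sketch, with hypothetical table and column names:

```python
# Minimal sketch of a load-time summary table: detail fact rows are
# rolled up to (region, month) totals once, so report queries read a
# precomputed row instead of re-aggregating. Names are hypothetical.
from collections import defaultdict

detail = [  # detail-level fact rows
    {"region": "East", "month": "2024-01", "sales": 100.0},
    {"region": "East", "month": "2024-01", "sales": 250.0},
    {"region": "West", "month": "2024-01", "sales": 75.0},
]

def build_summary(rows):
    """Roll detail rows up to (region, month) totals during the load."""
    summary = defaultdict(float)
    for r in rows:
        summary[(r["region"], r["month"])] += r["sales"]
    return dict(summary)

summary = build_summary(detail)
# A report query now reads one summary row rather than scanning detail.
east_jan = summary[("East", "2024-01")]
```

Aggregation appliances and caching layers automate this same trade of load-time work for query-time speed.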
What should I consider when choosing a data migration solution?
Determine the scope of the overall effort. It is useful to review the following when choosing a solution:
- Number and complexity of legacy systems
- Type of migration (i.e., batch, synchronous, or real-time)
- Legacy data quality
- Amount of data and history to be converted
- Target application architecture
- Bandwidth and availability
Next, IT organizations need to select the most appropriate approach given the project scope. Flexibility, reusability, and total lifetime cost should be considered.
The term “data migration” seems to describe a one-time event. However, any organization that chooses a data migration tool based on that perception may be making a mistake. The care and feeding of BI data stores are ongoing, with load management requirements and ever-expanding data migration processes. In addition, most organizations will migrate different operational systems over time. Therefore, selecting a data migration technology that can not only meet the needs of the project at hand, but can also be expanded with the options needed to meet the requirements of migration projects on the horizon, makes good business sense.
Does BI standardization only have the CFO and CIO’s best interests in mind?
Arguably the biggest driver for BI standardization is lower ownership costs, followed closely by streamlined processes and improved efficiencies. It is clear, then, that both the CFO and CIO stand to gain by consolidating their investments in BI. Ultimately, however, it is the business user who reaps tremendous rewards from a single BI platform that is a comprehensive, straightforward solution. MicroStrategy continually puts the business user’s needs first as the only BI platform with intuitive, easy, one-stop shopping for monitoring, reporting, and analyzing business performance. MicroStrategy customers such as Ace Hardware, Sprint, and Shaw Industries have standardized on the MicroStrategy platform to drive cost efficiency, optimize resources, and enhance business productivity.
No, the whole organization stands to gain. Any organization that standardizes on a single BI platform can expect synergies among business users who can tackle business problems using a common tool. There is nothing more frustrating to a business user than trying to find somebody who knows how to perform a certain operation in the tool he or she is working with. The result is non-productive time that is often hard to quantify. Without standardization, a formal support program, and the development of informal support within the business community, the organization can expect to collect a lot of shelfware.
What kind of transformations can I apply to large data volumes to achieve faster query processing?
You can speed your data-intensive applications by employing the following techniques: At the source level, convert database tables to flat files, and vice versa. At the record level, use joins, sorts, and aggregates, or simply copy records to the appropriate target(s). At the field level, apply data type and format conversions, arithmetic operations, pattern matching, and conditional operations. Applying these transformations will cleanse your data, enforce business rules for data quality, and quickly yield only the information you need for further analysis.
Selections and precalculations need to be at the top of your list for data transformations that will lead to faster queries. If you move selection to the extract stage using an industrial-strength sort program, you will also shorten the overall time it takes to load your BI data store. Building tables with popular summary statistics and derived columns calculated during the load process will not only make queries run faster, but will also reduce the time users spend formulating queries and yield more consistent and accurate reports.
Trillium Software®, a division of Harte-Hanks
How do I secure executive buy-in for my data quality initiative?
Detail the specific business problem your company faces and how a data quality solution can impact the organization as a whole. For example, explain: "The procurement department is losing $10 million per year because it is not able to negotiate competitive rates with suppliers." Underscore the specific business benefits you expect from the data quality project: a more accurate view of supplier relationships to negotiate price and save the company millions. Then discuss other business benefits that will follow: better customer service, new business opportunities, and the ability to derive value from existing procurement systems.
One way to secure executive buy-in for a data quality initiative is to document how poor data quality affects different areas of the organization. Although data quality problems may surface during a BI initiative, most can be traced back to operational systems. As such, these problems can have impacts on operations and strategic decision making. By quantifying and combining the costs of the problems in these areas, with the assistance of those managers who are affected, you can often put forward a compelling case for a data quality initiative. You will also gain the support of management in the process.