Q&A with the Experts
A business intelligence or data warehouse implementation can be a formidable undertaking. In these pages, leading business intelligence and data warehousing solution providers share their answers to the questions they hear often from industry professionals. Tim Feetham, an independent consultant, provides his analyst viewpoint to each Q&A.
Introducing performance management reporting to a diversely skilled workforce is incredibly expensive and time-consuming. What is the best strategy for driving adoption of new reporting applications across the enterprise?
Large user communities, as a whole, do not want to learn new technologies. They prefer to use the productivity applications to which they are accustomed. Therefore, to ensure that operational performance management solutions are widely used, we recommend deploying them in familiar forms such as personalized, interactive Web applications and fully functional spreadsheets. This way, we immediately overcome the first objection to new technology. Instead, users say: “I knew how to use it when I opened it.”
In performance management, getting relevant feedback to the folks who can make a difference is essential. Organizations that undertake this task must also understand that what is relevant today will change tomorrow. These organizations face two issues: how to generate broad adoption and how to stay flexible. These needs point to Web-based reporting technologies that deliver reporting, visualization, and analysis tools, plus seamless spreadsheet integration under a unified but customizable interface. Users will be quick to adopt this technology, and support organizations will be able to focus more on tailoring the product to the pressing business issues at hand.
Why is EIM so important to BI?
It’s not enough for business intelligence (BI) software to simply produce a pretty chart or report. It has to deliver information that is credible and accurate—information that people can trust. Look for BI vendors with integrated products, services, and partnerships that can help you deliver enterprise information management (EIM). EIM is the strategy, practices, and technologies needed to deliver a comprehensive approach to managing disparate data in order to drive performance. EIM requires robust data integration and data quality capabilities that are tightly linked to the BI platform.
Organizations often spend considerable money on business intelligence (BI) technologies but don’t support their purchases with sufficient planning. The likely result is poor data quality, a lack of timely access to requisite information, and a sense that BI was a poor investment. Savvy organizations select their BI technologies within the framework of an enterprise information management (EIM) strategy. EIM acts as an umbrella program for data architecture, data governance, data warehousing, and information delivery initiatives, as well as an organizing principle for information workers. When BI technology is deployed in such an environment, the payoff is dramatic.
User adoption of our business intelligence tools is low. What’s wrong?
There are two possible reasons for disappointing BI tool adoption: Either the tool is a poor fit, or users don’t need or trust the data it uses.
A poor fit may result from one or more of the following reasons:
- IT chose the tool without user involvement or adequately defined requirements.
- A tool evaluation was performed without guidance from experienced practitioners.
- Someone has used the tool previously and figures it will work well again.
Of course, if users don’t trust or need the data provided, even the best BI tool won’t help.
Business intelligence (BI) tools by themselves do not deliver business value to the organization, and without business value, users have little incentive to learn these tools. Having high-quality data that holds the potential answers to key business questions, stored within an easily understood structure, is a prerequisite for broad BI tool adoption. Further, to gain wide adoption, BI tools must fit the needs of different types of users, which requires the involvement of both casual users and analysts. Organizations without deep experience in these areas will be well served by working with knowledgeable consultants.
What is the role of data quality in a customer data integration (CDI) initiative?
Put simply, effective CDI requires powerful data quality technology. When creating a “single view of the customer,” organizations need a technology that can standardize, verify, and correct data across sources. In addition, data quality technologies provide matching technology, also called identity management, which helps resolve instances of the same customer record across sources and allows companies to understand the total value of every customer. CDI vendors have typically partnered for data quality technology, but by using a data quality solution as the foundation of CDI, you can realize a faster time-to-value from CDI initiatives.
There can be no more essential data quality effort than in the area of customer data integration (CDI). CDI is tricky. It requires that different departments, such as sales, service, and accounts receivable, share key pieces of customer data. Given that this data reflects the quality of the interaction between an organization and its chief revenue source, poor customer data quality will have a deleterious effect on the bottom line. Although individual departments may have specialized data needs that arise from unique customer interactions, well-managed deployment of CDI technology can ensure that shared data is consistent, accurate, and timely.
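As a minimal illustration of the matching (identity management) step described above, the sketch below standardizes customer records from two hypothetical source systems and groups records that share a normalized key. All field names, sample data, and normalization rules are invented for the example; real data quality tools apply far more sophisticated standardization and fuzzy matching.

```python
from collections import defaultdict

def normalize(record):
    # Crude standardization: lowercase, strip periods and extra
    # whitespace from the name, keep the 5-digit ZIP prefix.
    name = " ".join(record["name"].lower().replace(".", "").split())
    return (name, record["zip"].strip()[:5])

# Two hypothetical source systems holding the same customer.
billing = [{"name": "Ann  Lee", "zip": "98101-1234", "spend": 1200}]
service = [{"name": "ann lee.", "zip": "98101", "spend": 300}]

matched = defaultdict(list)
for rec in billing + service:
    matched[normalize(rec)].append(rec)

# Total customer value across sources: the "single view of the customer."
for key, recs in matched.items():
    print(key, sum(r["spend"] for r in recs))
```

Here both records resolve to the key `('ann lee', '98101')`, so the customer's total value (1500) can be computed across departments.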
What is the difference between Ingres and Postgres?
Ingres emerged circa 1974 under Michael Stonebraker and Eugene Wong at UC Berkeley. In 1980, Ingres and Oracle entered the commercial world as the two leading relational database management system (RDBMS) products. Ingres is the progenitor of RDBMS products such as Informix, SQL Server, and Tandem’s NonStop, among others. Stonebraker later developed Postgres, a derivative of Ingres, as an object-relational DBMS (for unstructured data types). Postgres remained open source, while Ingres has had commercial engineering oversight since its commercial availability. Computer Associates bought Ingres in 1990, and it is the database within CA products. Ingres is ISO 9001 certified.
Michael Stonebraker released Postgres, an object-relational DBMS, to the open source community in 1986. It has since gained wide acceptance in that group. Ingres, which he helped create in 1974, made its reputation as a leading commercial RDBMS. Computer Associates purchased Ingres in 1990. CA made it its key database product and encouraged its customers to move to Ingres for Y2K. Last year, CA released Ingres into the open source community. Although both products are now in the open source community, Ingres has benefited from years of market discipline, making it an ideal engine for low-cost database appliance technology.
Why is having a unified business performance management system critical to the success of today’s enterprise?
This is being driven by the need to progress from operating tactically to taking a more strategic perspective. It is no longer sufficient to simply report results that are not tied to strategic objectives or business plans. Enterprises need to link their strategic objectives with operational goals. To implement this, businesses must report and analyze financial and non-financial information in a coordinated fashion. This then allows them to monitor and compare performance to plans, adjust plans to respond to changes in the business landscape, and perform advanced planning by modeling potential scenarios. A unified BPM system enables this movement to strategic analysis by integrating a BI platform with a suite of financial applications to deliver tightly integrated capabilities to the business user.
A business performance management (BPM) initiative that provides a single set of key metrics (leading indicators) to upper management does not have anywhere near the return of a unified BPM initiative. Only naive managers ignore the fact that individuals at all levels will respond to changes in how they are measured. A unified BPM initiative that includes the right BI technology will measure employees and business partners with five to seven key metrics that support the goals of the enterprise. These individuals must also be the ones able to affect these measures. Unified BPM must accommodate new metrics as the environment changes.
What circumstances prevent widespread adoption of operational BI?
Performance is one of the key requirements of operational BI—people making operational decisions need information in seconds, not hours or days. Unfortunately, most BI solutions can’t deliver this level of performance, and that prevents companies from tapping the potential of operational BI. The good news is that as companies add data aggregation software to their reporting environments, we’re now seeing operational BI become a reality. Data aggregation software from HyperRoll, for example, accelerates existing reporting solutions to deliver information in seconds. Faster performance will be the catalyst for widespread operational BI.
Many managers assume that operational BI will kill the performance of their transaction systems, so they continue to rely on the operational reports that they already have, even though these reports are usually static, not timely, difficult to change, and may track the wrong metrics. However, the technology now exists to deploy operational BI safely: BI tools with access not only to transaction system detail, but also to low-latency, aggregated data in which managers and analysts can quickly recognize trends.
Do I need to choose one type of OLAP technology (ROLAP or MOLAP), or are there reasons to implement both? Also, is Linux ready for hard-core business intelligence?
Different users require different tools, so it may not be possible to standardize on just one. However, there are benefits to reducing the number of supported tools. Look into creating a business intelligence competency center to support your tools, and ensure your database supports many vendors’ OLAP functions and metadata transfer.
And, yes! Linux is ready. Look into the various offerings from major vendors and determine which best meets your needs.
The MOLAP versus ROLAP argument boils down to speed of analysis versus flexibility. MOLAP is usually the analyst’s choice, and ROLAP favors reporting. Although evolving database technologies and client tools are blurring the lines between these two classes of technologies, performance differences continue to exist, and an organization that wishes to get the most out of its data resources will support both.
Although open source BI technology is still in the development stage, increasing numbers of mainstream BI and database vendors have robust Linux offerings. Linux has become a key operating system in the BI server space.
We’d like to move to a more standardized approach to data integration across our organization, but how do we justify that to the business?
IT organizations can do three things to demonstrate the business value of enterprisewide data integration:
- Tie data integration projects to specific, urgent business initiatives with measurable impact, such as merger and acquisition consolidation or regulatory compliance.
- Avoid the “big-bang” approach. An enterprise data integration infrastructure should be rolled out incrementally to reduce risk.
- Emphasize the need for robust data governance to ensure data quality, auditability, and availability, and to manage and protect information as a valued enterprise asset.
Individual projects that involve data integration, such as migration to a new ERP system or data warehousing projects, are often financially justified as part of a package of larger benefits. The irony of this situation is that in order to maximize benefit-cost ratios, the selection of a data integration technology for a given project may exclude features needed on the next project. This leads to multiple data integration tools and strategies that, taken as a whole, are more expensive. Organizations that implement a data integration competency center with an enterprise-class data integration technology will reap the benefits of scalability.
How important is seamless reportingand analysis to my business users?
More than ever before, business users cross the line from reporting to analysis and from consuming data to investigating it. Seamless reporting and analysis empowers business users to glean more information from every report, all within one interface.
Seamless reporting and analysis is essential for business user workflow processes and the prudent use of IT report development resources. As business users are more empowered, the IT staff is no longer the bottleneck for new report development.
With true seamless reporting and analysis, companies gain administrative and resource efficiencies because they no longer have to deploy separate dashboarding, reporting, and analysis technologies. The most advanced BI platforms infuse every report, scorecard, and dashboard with the ability to fully analyze the underlying data.
With a few exceptions, BI vendors are moving or have moved toward seamless reporting and analysis. They are responding to customers who want to maximize BI benefit-cost ratios. Not only does this technology reduce implementation, training, and support costs, it also makes users more efficient, whether they are casual consumers of information or full-time analysts. Casual users who view reports but also want to do some limited exploring can do so without having to learn another tool. Analysts will be freed from having to respond to simple information requests and will be able to spend their time on more complex issues.
What’s database partitioning, and what can it do for me?
Database partitioning is the division of a database into distinct, independent components (partitions). It’s a key tool for building and maintaining multi-terabyte databases, since it offers considerable manageability, performance, and availability advantages.
With partitioning, maintenance operations can be focused on particular portions of tables, allowing DBAs to pursue a divide-and-conquer approach. Maintenance can be performed on certain parts of the database while others remain up and running.
Performance can be improved through partitioning by limiting the amount of data to be examined or operated on. And from an availability perspective, if one partition of a partitioned table goes down, the other partitions will remain online and available to users.
Database partitioning means getting the most for your database dollars. Data warehousing initiatives or mergers and acquisitions can put significant upward pressure on database storage. Although most modern database technology can store large amounts of data, providing high access performance along with high availability is another matter. Partitioning plays a significant role in large database performance and maintenance. The leading database vendors have partitioning capabilities. However, not all take the same approach. Some products are geared toward transaction systems, and others toward data warehousing. Organizations will be well served by making sure that their database partitioning capabilities match their needs.
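The mechanics described above can be sketched in a few lines. The toy example below simulates horizontal range partitioning by year: rows land in per-year partitions, and a query over one year touches only that year's partition ("partition pruning"), so other partitions could be offline for maintenance without affecting the query. The data layout and partition key are invented for illustration.

```python
from collections import defaultdict

# Toy horizontal partitioning: one bucket of rows per year.
partitions = defaultdict(list)

def insert(row):
    # Partition key: the year prefix of the date column.
    partitions[row["date"][:4]].append(row)

for r in [{"date": "2004-03-01", "amt": 10},
          {"date": "2005-07-15", "amt": 20},
          {"date": "2005-11-30", "amt": 5}]:
    insert(r)

def total_for_year(year):
    # Only the matching partition is scanned; the rest of the
    # table is never touched by this query.
    return sum(r["amt"] for r in partitions[year])

print(total_for_year("2005"))  # scans the 2005 partition only -> 25
```

Real database engines apply the same idea at the storage level, pruning partitions based on the query's predicates rather than application code.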
What is E-LT, and how is it different from ETL?
E-LT stands for extract, then load and transform (as opposed to ETL, which means extract, transform, and load). The difference may seem subtle, but this swapping of letters makes a big difference with regard to architecture and performance. With the ETL approach (used by the majority of integration tools today), all of the data has to transit through an ETL engine. Worse, when a lookup of target data is required, data used for this lookup also needs to be moved from the target to the ETL engine, where all processing occurs. The E-LT approach, implemented mostly by new-generation integration software, uses the power of the RDBMS engine to execute all data mappings and transformations. All heavy-duty processing is done on the target (or in the sources when appropriate), leveraging the dataset-processing capabilities of RDBMS engines and decreasing data exchanges over the network. The performance gain of E-LT over ETL can reach several orders of magnitude!
Vendor support for E-LT (extract, load, and transform) as opposed to ETL (extract, transform, and load) recognizes that many data warehousing teams have done their transformation programming in the target database itself. Procedural extensions to database languages such as PL/SQL and T-SQL have made these technologies quite robust in capability and performance. When a data warehousing program has limited development staff, the database administrator will likely be pressed into service for programming duties. Because it relies on the transformation capabilities of the database, E-LT technology should provide that database administrator with a greater comfort level and quicker start-up time.
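A minimal E-LT sketch, using SQLite as a stand-in target database: the extracted rows are loaded untransformed into a staging table, and the transformation then runs as set-based SQL inside the target engine rather than in an external ETL engine. Table and column names are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in target database
con.execute("CREATE TABLE stage_orders (cust TEXT, amt REAL)")
con.execute("CREATE TABLE fact_orders (cust TEXT, total REAL)")

# E and L: bulk-load the extracted rows as-is, no transformation yet.
con.executemany("INSERT INTO stage_orders VALUES (?, ?)",
                [("a", 10.0), ("a", 15.0), ("b", 7.0)])

# T: the transform (here, an aggregation) executes inside the
# database engine as a single set-based statement.
con.execute("""INSERT INTO fact_orders
               SELECT cust, SUM(amt) FROM stage_orders GROUP BY cust""")

print(con.execute("SELECT * FROM fact_orders ORDER BY cust").fetchall())
```

Because the transformation is ordinary SQL in the target database, it is the kind of work a DBA can write and tune directly, which is the comfort-level point made above.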
Is it possible to accelerate the performance and reduce the complexity of my BI system while using the hardware and software my team already knows?
Some performance improvement can be gained through tuning and optimization techniques, technology upgrades, additional hardware, and hardware upgrades. But these methods are costly and have limitations. You might consider a preintegrated appliance, but these are based on proprietary, preconfigured platforms that can get expensive as your BI demands scale up.
A column-based analytics server is an economical choice to deliver the speed, scalability, and query flexibility required by users without requiring changes to your organization’s BI ecosystem. And typically, a column-based analytics server has built-in features that reduce the complexity of managing the BI ecosystem—requiring fewer DBA cycles, using standard hardware and OS, and integrating easily with current BI tools and applications.
Poor performance limits the usability of a business intelligence system. Good data access technology cannot overcome the limitations of a poorly designed and/or poorly tuned database. However, organizations seeking top performance from their business intelligence systems while reducing system complexity may want to consider specialized analytics servers. Unlike general-purpose database management systems that are designed to support transaction applications, these specialized servers are designed especially for business intelligence deployment. When choosing an analytics server, a wise organization will look for one that has a standard interface with its business intelligence technology. It should also require minimal maintenance.
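The core idea behind the column-based approach mentioned above can be shown with a toy contrast of storage layouts: an analytic query that aggregates one column reads only that column's array in a column-oriented layout, instead of every field of every row. The data and layout here are purely illustrative, not how any particular product stores data.

```python
# Row-oriented layout: each row carries all of its fields together.
rows = [{"cust": "a", "region": "west", "amt": 10.0},
        {"cust": "b", "region": "east", "amt": 7.0}]

# Column-oriented layout: one contiguous array per column.
columns = {"cust": ["a", "b"],
           "region": ["west", "east"],
           "amt": [10.0, 7.0]}

row_total = sum(r["amt"] for r in rows)  # touches every row in full
col_total = sum(columns["amt"])          # touches only the amt column
print(row_total == col_total)  # same answer, far less data scanned
```

At warehouse scale this difference in bytes scanned (plus per-column compression) is what lets column stores answer analytic queries quickly on standard hardware.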
We need to understand more about our customers’ purchases, such as when they first purchased and the amount of their largest purchase. How can we quickly and easily compile this information?
Data warehouse experts agree that aggregates are the best way to speed warehouse queries. A query answered from base-level data can take hours and involve millions of data records and millions of calculations. With precalculated aggregates, the same query can be answered in seconds with just a few records and calculations. DMExpress provides several functions for aggregating your data. You might try creating an aggregate that keeps the dates of the first and largest purchases, the first purchase amount, the largest purchase amount, and the average and total purchase amounts. To rank your customers by number of orders, take a count of the records summarized.
These questions and future queries about customer behavior can be answered from a well-designed database. Such a database will contain history (such as when the customer made his/her first purchase), and it will be integrated, since we are interested in purchases that might involve data aggregations over different products and different locations. A savvy database designer will also include summary tables that precalculate data that can provide answers for questions (such as the total amount spent by customers on product x). A high-performance ETL tool will help produce these summaries with a minimum of delay.
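The kind of precalculated aggregate described above can be sketched as a single pass over purchase records that keeps, per customer, the first purchase date, largest purchase amount, total spend, and order count. The record layout is invented for the example; a real aggregation tool would maintain the same summary incrementally as new data loads.

```python
# Hypothetical purchase records: (customer_id, ISO date, amount).
purchases = [("c1", "2004-01-05", 40.0),
             ("c1", "2005-06-10", 90.0),
             ("c2", "2005-02-02", 15.0)]

agg = {}  # cust -> [first_date, max_amt, total_amt, n_orders]
for cust, date, amt in purchases:
    a = agg.setdefault(cust, [date, amt, 0.0, 0])
    a[0] = min(a[0], date)   # first purchase (ISO dates sort lexically)
    a[1] = max(a[1], amt)    # largest purchase
    a[2] += amt              # total spend
    a[3] += 1                # order count, for ranking by orders

print(agg["c1"])  # -> ['2004-01-05', 90.0, 130.0, 2]
```

A query against this summary touches one small record per customer instead of rescanning every base-level purchase row.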