Q&A with the Experts
A business intelligence or data warehouse implementation can be a formidable undertaking. In these pages, leading business intelligence and data warehousing solution providers share their answers to the questions they hear often from industry professionals. Mark Hammond, an independent consultant, provides his analyst viewpoint to each Q&A.
Actuate
How can I accomplish true spreadsheet automation?
Eliminate common problems at their source so that the data is correct, complete, and secure, the formulas and design structure of the document are sturdy and accurate, and the process is repeatable and reliable. Focus on:
- The data. Eliminate manual entry, copy and paste, and .CSV importing as much as possible; instead, manage queries through an enterprise metadata layer to ensure accuracy and security.
- The design. Use a “blueprint-based” design approach and tools specifically meant to improve design practices.
- The distribution. Automate with a server-centric architecture that manufactures pure Excel documents and does not require an Office plug-in to pull data in order to refresh the document.
Analyst Viewpoint
Despite glitzier BI applications, spreadsheets remain a highly popular data analysis tool. Loyal users have been willing to tolerate the risk of errors introduced through manual data entry, as well as the time required to load and manipulate data. The business as a whole, of course, is more risk-averse and frowns on unrepeatable processes, redundancy, and duplication of “spreadmarts.” Spreadsheet automation can offer a significant improvement by centralizing spreadsheet creation and distribution in a consistent framework. Adopters should take pains to implement appropriate access and security mechanisms, and some organizations will want to examine spreadsheet automation from a compliance perspective to ensure verifiable data integrity.
Business Objects
Our company has seen sizable growth in our Internet sales channel over the past few years, but we know very little about thecustomers who use—or don’t use—this channel. How can we optimize even more sales and profitability through this channel?
Integrate customer segmentation data from a third-party source with your transactional data from your Web site totarget customers who are not yet using this channel or who abandon a purchase before completion. An online ticketbroker recently did this and gained greater understanding of customer buying patterns. Now it can anticipate demand anddeploy changes to its marketplace in near real time to accommodate market-changing events.
Analyst Viewpoint
Third-party demographic data that matchesthe identity of customers who have purchasedfrom your Web site can provide ahuge advantage in building out profiles byage, gender, location, and buying patterns.Look to capture as much information as possibleabout buyers during your e-commercetransaction processes (short, unobtrusivesurveys can be effective). Close analysis ofWeb site traffic patterns can pay off withrevealing trends showing how surfers got toyour site, and where they go next. Over time,you can build a foundation of segmentedcustomer information that is ripe for BI analysisto determine sales channel effectivenessand customer preferences and to supporttargeted marketing campaigns with specialoffers, e-newsletters, and the like.
Collaborative Consulting
We have an existing data warehouse that was built to support the business, but usage is lower than expected. How canwe foster adoption?
The answer is simple, but the challengeis not. Give the business what it needs,not what you think it needs, or what theyask for the first time out. It is difficultfor users to articulate exactly what theywant before they see it for the first time.Low adoption or waning adoption is notuncommon once the initial enthusiasmwears off and the limitations becomeapparent. Revisiting the business purposeand applicability frequently with the usersis key to success. Continually ask thequestion, “What purpose does the datawarehouse serve for the business, and howwell is it serving that purpose?”
Analyst Viewpoint
To increase the adoption and value from adata warehouse, it’s essential to first understandwhy usage is lower than expected.Survey the user population on what they likeand what they don’t. Identify the pain pointsthat users experience. Assemble a focusgroup that drives collaboration betweenbusiness and IT. Training sessions providean opportunity to hear user concerns andcommunicate the DW value proposition.Recognize that force-feeding new technologiesoften generates resistance and subparadoption. If your users are reluctant toabandon Excel, look into Excel add-ons thatinterface with your data warehouse to helpincrementally boost usage.
DataFlux
What’s the best way to start a data governance program?
Any data governance program will cover three elements: people, policies, and technology. To begin any effort, companiesshould assess their level of maturity across each of those elements. For the “people” component, organizations can measurewhat level of support the governance effort has (executive, director) as well as the number of resources devoted todata stewardship. “Policies” refers to the number and complexity of business rules and controls that the company can implementwithin the IT environment. Finally, “technology” refers to the data profiling, data quality, and data monitoring capabilitiesavailable to enhance and maintain the value of corporate information.
Analyst Viewpoint
The short answer — keep it simple! Startyour data governance program by usinga framework of the six basic questions ofwho, what, where, when, why, and how.This clearly defined structure can bringfocus to your efforts and plainly delineatechallenges, objectives, and roadmaps for adiverse group of stakeholders. Organizationsneed to be sensitive to the risk of a datagovernance program spiraling into unnecessarycomplexity and contradictory conceptsfrom the start. By starting simply, you canbuild a readily understood foundation thatprogressively matures as participants masterthe nuances and details typical to data governance.Naturally, organizations will benefitby researching best practices and lessonslearned from companies that have adopteddata governance programs.
DATAllegro, Inc.
We would like to encrypt our data warehouse in order to be compliant with recent legislation, but we are concerned about performance. Is it possible to encrypt data but not affect data load and data query performance?
Absolutely. Many data warehouse administratorsassume that data encryption isexpensive, difficult to administer, and willcreate a significant degradation to performance.This is because most encryptionschemes rely on software-based encryption,or they try to employ encryptionof data “in flight.” With hardware-basedencryption for data “at rest,” a dedicatedencryption engine sits between the databaseand the disk storage used by eachserver. Since all data is decrypted beforethe database software starts processingit, all queries and loads work the sameas before, with no significant impact onperformance.
Analyst Viewpoint
Identifying and implementing an ideal encryption solution can be nearly as complex as encrypted data itself. The choices typically involve trade-offs in price, performance, and degrees of security. Hardware- based solutions can sidestep the load and query performance penalty inevitable with database-level encryption by offloading encryption to a dedicated machine; however, some of these systems can be expensive and introduce an additional layer of interfaces to manage. Despite the management overhead that both software- and hardware-based encryption can impose, encryption offers the ultimate in data security and is attracting increased attention from organizations with a need to protect sensitive data.
ESRI
GIS seems to be deployed departmentally; can it be a strategic enterprise resource?
ESRI has developed its products basedon open standards, ensuring a high levelof interoperability across platforms,databases, development languages, andapplications. ESRI’s GIS applications areengineered to be an integral part of anorganization’s strategic IT infrastructure.Our use of Web standards such as XML,SOAP, UDDI and WSDL has led to successfulenterprise SOA implementationswhere specific location services such asgeocoding, are being used throughout theorganization. Our support of standarddeveloper environments including VB,C++, .NET, and Java has led to integrationswith all major BI platforms.
Analyst Viewpoint
To date, most GIS deployments in mainstreambusiness (outside of specializedutilities and energy systems, for instance)have focused on specific issues such asstore location, customer clustering, andlogistics. Lately, however, spatial data isincreasingly featured in BI dashboards andprovides business users with a visual meansof understanding and analyzing informationin a location-based context. Giving usersan opportunity to explore the capabilities ofGIS in dashboards is a good way to furtheradoption across the enterprise. Partnershipsbetween GIS and BI vendors have madespatial analysis an option for many organizations,and, interestingly, Web 2.0 sites suchas frappr.com are helping to put spatial datamore squarely on the map.
FAST
What exactly is “ad hoc query and reporting” and how can I use it to improve adoption of my BI solution?
Ad hoc tools allow users to spontaneously browse, slice and dice, and discover aggregate intelligence and detail evidence. This contrasts with more static BI models where subscribers receive reports predetermined by power users.
To provide such an application, relevant datastores (e.g., a data warehouse or internal/external sources) are connected to a data cleansing, text analytics, and highperformance index and query engine. On this platform, one or more ad hoc entry points can be provided (keyword querying, dynamically built drill-down trees, etc.) to access real-time aggregates and distribution charts.
With an ad hoc reporting tool, individuals can view the facts and figures relevant to their tasks, thereby increasing their performance and effectiveness. The combined intuitive simplicity and direct feedback are proven to dramatically increase user adoption of BI tools.
Analyst Viewpoint
Theoretically, ad hoc query and reportingempowers enterprising business users toexplore data by drilling down, slicing anddicing, and conducting what-if analysis. Thedividends can be huge if a user discoversan unrecognized relationship that bears onefficiency or the bottom line. On the otherhand, organizations need to recognize thatextending ad hoc query across a broad userbase can mean that some users will wastehours running queries of little businessvalue, sometimes merely to satisfy intellectualcuriosity. Organizations will profit by ajudicious allocation of ad hoc query capabilitiesand by monitoring the usage and overalleffectiveness of these analytic tools.
IBM
How important is metadata for successful data warehousing projects?
Extremely important. A centralized metadata repository for information integration — one that supports analysis and active sharing of metadata artifacts across a full range of integration activities and user roles — increases the trust and understanding of the information. These artifacts include: technical metadata about the various sources of information; business metadata that describes the business meaning and usage of information; and operational metadata that describes what happens within the integration process. As companies seek to gain a more comprehensive viewpoint of their information, they need a way to keep track of how these different sources are related and how information is being pulled together and presented in reporting and analysis tools.
Analyst Viewpoint
The need for regulatory compliance is a high-profile business driver that is prompting organizations to pay more attention to the management of metadata in data warehousing and operational systems. Leading metadata management tools feature robust lineage tracking, change analysis, and auditability features that can help organizations better account for the iterations and lifecycle of enterprise information at a granular metadata level. However, organizations need to be wary of the risk of architecting metadata silos and should strive towards standards-based, extensible metadata models that reduce, rather than add to, complexity in the data management infrastructure.
Informatica Corporation
What is the role of data integration for enterprise data warehousing?
A critical consideration for deploying enterprise data warehousing is creating an architecture that can support the data requirements of all types of applications. This is a departure from traditional thinking, in which the data warehouse was built to support a relatively limited set of reports and analytics, and was tightly coupled to the business intelligence application.
In contrast, the enterprise data warehousing (EDW) process must function atop a data integration platform that powers the ongoing lifecycle of data access, discovery, quality, integration, and delivery via data services that can support multiple applications as part of a serviceoriented architecture (SOA).
Analyst Viewpoint
Because it spans multiple divisions andgeographic locations and is aimed at improvingenterprise performance, enterprise datawarehousing puts an even greater premiumon the quality, consistency, and accuracy ofdata. As a result, the data integration processesthat support EDW are arguably themost important element in the EDW equation.Perfecting data integration technologiesand processes for EDW can be a significantchallenge given the breadth and complexityof enterprise data and the range of disparatesource systems. However, organizations thatsidestep this critical element introduce thesignificant risk of the “garbage in, garbageout” phenomenon and can undermine thevalue that EDW offers.
Information Builders
Use of business intelligence and analytics is not new. How have customers’ requirements forthese decision-support solutions changed in the last few years?
The change occurs in two areas: peopleand data. Organizations access and analyzeless than 20 percent of the data theycollect from their business processes andsystems. The nontechnical managers,operational employees, business partners,and customers demand it, and we havean opportunity to address the demandsof masses of users. The interface for BI isno longer a complex tool or a manuallygenerated spreadsheet; instead, it takesthe form of custom applications, dashboards,operational scorecards, maps,and data visualization elements. Dataarchitectures for BI must go beyond thetraditional data warehouse and incorporateoperational and third-party data, aswell as new types of data.
Analyst Viewpoint
Over the past few years, broadened BIadoption has given organizations a goodread on the insights and value that may begenerated by an effective BI deployment.Now customers are looking to build onsuccess in several key areas, including:1) enhancing the business relevance ofdata; 2) enabling more proactive decisionmaking;3) tightening real-time BI integrationwith business processes; and 4) improvingend-user trust in data accuracy. To achievethose objectives, customers are looking forstandards-based solutions that integrate inan SOA and reduce the complexity of siloedBI deployments, transitioning towards holisticenterprise BI systems.
Microsoft
How does Microsoft plan to democratize performance management?
Traditional inhibitors to the broad adoptionof performance management have been high costs associated with implementations,complex tools and user interfaces that require costly and time-consumingtraining for employees, and confusion over disparate systems and tools for thevarious capabilities, including planning, budgeting, forecasting, analytics andscorecarding. Performance management tools and processes traditionally havealso been siloed or stovepiped, meaning that they sit outside the day-to-daybusiness processes of most employees. Office PerformancePoint Server 2007was designed to address and eliminate these inhibitors, enabling performancemanagement across the enterprise, not just for the CFO and financial analyst.
Analyst Viewpoint
Giving lower-level employees tools thatencourage contributions to enterprise performanceis a sound idea. The concept of“democratizing performance management,”a key theme in Microsoft’s rollout of itsPerformancePoint CPM solution, will promptcustomer organizations to ask what performancemanagement means for a CFOversus a call center rep versus other userroles. Organizations that extend greaterperformance management to lower-levelemployees will need to safeguard informationaccess to ensure security of sensitivesales, productivity, and other performancerelatedinformation.
MicroStrategy
What’s the best way for executives to consume all ofthe data needed to manage their businesses?
Every day, executives and managers arefaced with the challenge of reviewing volumesof data to make important businessdecisions. Traditionally, managers wouldhave to find, run, and review dozens ofreports and piece together similar dataamong those reports. With the new breedof dynamic dashboards, information froma dozen or more reports can be compressedinto one expressive dashboard. Intuitiveon-dashboard controls allow users to fliprapidly through interrelated informationviews without changing focus or context.The dashboard buttons and selectors areeasy to use, so users don’t need training.Managers can now find all of the interrelateddata in a single dashboard.
Analyst Viewpoint
Dashboards and scorecards have emergedas preferred ways for many executives toview performance information and quicklyfind, analyze, and explore the data theyneed to perform their jobs. Which solutionis selected will depend on business needs,and it’s worthwhile to understand the subtledistinction between the two. Dashboards aregeared to inform users what the organizationis doing. Scorecards emphasize how wellthey are doing it by tracking progress againstmetrics. Clarifying business objectives fora dashboard (management) or scorecard(monitoring) at the start will help yourorganization select and deploy the optimalsolution for executives to consume and reactto business information.
Oracle
What can my organization gain by integrating business intelligence technologies with service-oriented architectures (SOA)?
First, it can gain an understanding of the business performance impact of business processes. Improving business results requires detailed business insight. Without enough insight into the impact and contribution of a business process on overall performance and goals, it is difficult to determine what changes should be made to improve business performance.
Second, facilitate business users taking actions as a result of insight. When business users gain insights that flag a business performance issue, the resulting corrective actions often require invoking a business process. These processes could now be started directly from the BI dashboards used to analyze performance, thereby improving efficiency and agility.
Analyst Viewpoint
Business intelligence is a natural fit for theservice-oriented architecture (SOA). Properlyimplemented, an SOA can provide aloosely coupled and extensible environmentin which previously siloed BI applicationscan interoperate — an important steptowards realizing the ideal of enterprise datawarehousing and enterprise performancemanagement. Query and reporting can berendered as services (executing in concertwith requisite data integration services)that readily extend the functional reachof analytics across a broader number ofsource systems. Start with small, quick-hitprojects that deliver high value and build afoundation of modular components and bestpractices that can be incrementally extendedacross your infrastructure.
Pentaho
What can BI vendors do to become more customer-centric?
Given the increasing adoption and strategicimportance of BI, customers arenow expecting greater transparency fromtheir BI vendors. In many instances,customers and prospects want access toproduct documentation, product roadmaps,and unfiltered feedback from otherusers, or even trial software for productsthat they have not purchased. Customersshouldn’t be afraid to ask for any of these,and vendors shouldn’t hesitate to providethem. BI licensing is another area thatneeds more transparency. Customers haterepurchasing products or paying enablementfees for products they thought theyalready had paid for.
Analyst Viewpoint
Steady growth in the availability and functionalityof open source BI software is beginning to drive greater customer-centricityin the BI market. Traditional BI vendorsare having to rethink licensing terms, freetrials, upgrade cycles, and source documentationas the number of open source BI providers grows and more enterprisesexplore open source BI as a lower-cost,quick-hit alternative to traditional solutions.On the technology side, commercial vendorsare challenged to answer increased customerdemand for highly flexible, SOA-ready applications based on open standards thatcan be readily embedded in existing systems.As open source BI continues to evolve, transparency and flexibility in meeting customerexpectations will be key.
SAP
Why is it important to empower business users to own and manage CPM solutions?
Businesses change continuously. ITdoesn’t understand nuances of thechanges, and the business users can’timplement them. As a result, systems aremonths behind business requirements.It is important that business users areempowered to react to change and implementthe changes on their own. This ismost critical for corporate performancemanagement (CPM) solutions. With budgetingor variance analysis reporting, or astrategy review, needs of users dynamicand often ad hoc; it is essential that businessusers model and implement changes.CPM offerings enable the business user toown, model, and maintain the solutions.This, along with the advantage of integrationwith ERP and built-in governance,risk, and compliance (GRC), can helpbusiness users implement changes withoutdepending on IT.
Analyst Viewpoint
The desirability of business owners using andmanaging corporate performance managementsolutions (CPM) reflects a broader trendtowards greater business-side involvement inand management of BI systems. Organizationsare gravitating toward a more user-centric BImodel and away from the IT-led, systems-orienteddata management practices of the past.Achieving this ideal requires greater collaborationbetween business and IT, with tech-savvybusiness sponsors who can bridge the gapbetween corporate objectives and supportingtechnology. CPM systems are an obvioustarget for business ownership because theyare near the top of the BI food chain and areexpected to enable game-changing insightsby putting useful performance managementinformation in the hands of executivesand managers.
SPSS
How can I extract intelligence from textual data?
Text is very different from the conventional, structured data held in customer data warehouses, but it is rich in context. You can mine text (for example, call center notes or responses to survey questions) to extract concepts — the most important and relevant words and phrases. These concepts, along with information on how they are linked to each other and the sentiment (positive/ negative) associated with them, are effectively structured data. They can be merged with behavioral, descriptive, and attitudinal data to build a holistic customer view for predictive analytics./
Analyst Viewpoint
Vendors have made significant strides in thesophistication and functionality of text analyticssoftware in recent years. The “how”of text mining will vary by product, but the“why” is common across many organizations:unstructured and semistructured datais necessary for organizations that want toachieve a fully mature view of enterpriseperformance. TDWI research shows thatorganizations’ use of unstructured datain data warehouses is on the rise, whilevendors and data warehousing professionalsare working to refine techniques andtechnologies needed to transform it intostructures meaningful to a warehouse orto a reporting tool.
Sybase
How can I increase user scalability in a distributed analytics environment?
A Sybase IQ multiplex configuration provides virtually unlimited user scalability. To increase the number of users, add a Sybase IQ node running on an additional server. Each node can support different users or manage different reporting or query workloads. Nodes can also partition the user base by service quality, so that higher performing nodes can support high value users or applications.
Multiplexing enables Sybase IQ to optimize workloads across multiple servers within a single environment. With one writer node and multiple reader nodes, adding additional applications and users to a database can be solved simply by adding reader nodes. The impact is minimal, as nodes do not depend upon or interfere with each other.
Analyst Viewpoint
Grid computing environments are gainingattention as viable platforms to enhancescalability for large-scale data warehousingas RDBMS, BI, and data integration vendorsincreasingly add native functionality to takeadvantage of multi-node clusters. Theseplatforms will feature load balancing algorithmsto distribute queries across nodesbased on resource utilization (e.g., CPUusage, memory, disk speed), and enableadministrators to prioritize task executionbased on user role, time of day, short- andlong-running queries, and other criteria.Naturally, diligent trial-and-error applicationof performance tuning techniques, partitioning,and parallel processing can also paysignificant scalability dividends. Apart fromscalability, grid systems offer an opportunityto build in high availability and failover tobetter ensure system continuity.
Syncsort Incorporated
As data volumes continue to grow, uploading data to the data warehouse has become much too time-consuming. Is there away to speed this process?
Uploading data can be simplified by using a data management solution with changed data capture (CDC). Also known as delta processing, the technique compares old and new files to identify changes. Only the changed data is updated in the master file, reducing the amount of data uploaded to a data warehouse, thus improving performance.
In a simple example, create a join using yesterday’s transaction file as one source and today’s transaction file as the other. Next, define the join key from yesterday’s data to be the entire record, and define the join key from today’s transaction file to be today’s entire record. Keep matched records, unmatched records from yesterday’s data, and unmatched records from today’s data. Create three conditions (Added, Deleted, and Updated) using a tool such as Syncsort’s DMExpress. Finally, add a target file and partition the data by these three conditions, creating one output file for each condition.
Analyst Viewpoint
Though not new, changed data capture(CDC) has found its sweet spot with theadvent of the real-time data warehouse.Available as a built-in or optional componentin leading ETL tools, CDC can drasticallyreduce the volume of data to be moved intoa warehouse. Reducing data loads by 85percent can reduce load times at a similarrate, making multiple intraday loads practical.The additional benefits of minimizing impacton operational applications, bandwidth, andI/O make CDC worth considering by any organizationthat wants to freshen its warehousedata to enhance its business value.
TCS
How do we enhance the level of end-user confidence in the data warehouse?
To increase end users’ confidence in thedata warehouse, it is imperative to builda transparent audit process using businesscontrols for the data integration.Business controls are essentially sets ofdata elements at the higher levels of grainthat determine the business performance,e.g., sales revenue or number of new businessacquired. It is essential to establishan automated balancing control processthrough alternate data flows and to establisha baseline for correctness. Approachessuch as general ledger balancing for financialmetrics also help.
Analyst Viewpoint
It’s essential first to gauge the degree ofend-user confidence in a data warehouse,and to pinpoint specific issues that may beundermining user confidence (and causingunderutilization of the system). Dependingon the gravity of the situation, this effortcould be either an informal survey andfollow-up, or a structured program thatspans multiple divisions and extends forseveral months. By assessing end-userconfidence in such areas as data quality,data timeliness, ease of use, and query/reporting effectiveness, organizations canbuild a roadmap to attack problem areasand continually enhance the business valueof DW systems.
Next
Previous
Back to Table of Contents