Reeling in Spreadmarts
By Wayne Eckerson, Director of Research, TDWI
The Proliferation of Spreadmarts
In 2002, I created the term “spreadmart” to describe the renegade spreadsheets and desktop databases that are wreaking havoc on organizations.1 Since then, many people have adopted the term because it highlights a painful, yet largely ignored, problem that plagues organizations today.
Spreadmarts contain vital pieces of corporate data that are needed to run the business. But since spreadmarts are created by individuals at different times using different data sources and rules for defining metrics, they create a fractured view of the enterprise. Without a single version of corporate data and centrally defined metrics, employees can’t share a common understanding the business. With spreadmarts, each worker marches to the “beat of their own drummer” instead of marching together toward a common goal. In short, spreadmarts undermine corporate productivity and sabotage business alignment
Today, spreadmarts are the bane of IT departments who can’t control their proliferation, and the nemesis of CEOs who can’t gain an accurate view of enterprise activity because of them. In many respects, spreadmarts are the corporate equivalent of terrorists—just as soon as you eliminate one, ten more spreadmarts pop up to take its place.
TDWI’s research report, “In Search of a Single Version of Truth: Strategies for Consolidating Analytic Silos,” shows that organizations have on average 28.5 spreadmarts that they want to consolidate.2 In reality, most organizations have no idea how many spreadmarts they have, and some are afraid to count!
For example, one large retailer scanned its file systems and discovered 100,000 distinct instances of Microsoft’ Access desktop database. Although some of the databases were embedded in applications, most were not. Claiming it suffered from “Access Anarchy,” the firm decided to do something to reel in these debilitating spreadmarts.
Controlling Spreadmarts
However, recognizing that you have a spreadmart problem and fixing it are two different matters. Some companies believe that eradicating spreadmarts is impossible and don’t even try. Of course, this approach is shortsighted; unless you develop a strategy to deal with spreadmarts, they will end up choking your business like an invasive weed.
One reason spreadmarts are so difficult to stamp out is because the tools used to create them are so easy to use. Microsoft’s Excel spreadsheet and Access database have succeeded beyond the imagination of even the most rosy-eyed Microsoft developer. They have greatly empowered individual business professionals, but at a tremendous cost to enterprise productivity and alignment.
The problem with Excel and Access is that these tools make it too easy for today’s knowledge workers to do four things: collect, integrate, define, and analyze data. While knowledge workers are paid to do the last item (analyze data), they are not hired to do the other three. Presumably, organizations maintain IT departments to collect and integrate data, and they assign data stewards (i.e. individuals, departments, or teams) to define and manage the meaning of corporate metrics and data.
Another reason it is difficult to wean users from spreadmarts is that corporate BI environments often lack sufficient functionality. If a BI tool doesn’t provide enough analytical power, users will simply dump the results of a BI query into Excel for analysis. In addition, if the information architecture doesn’t let power users access data from multiple sources in a timely fashion, users will be tempted to use Excel or Access to circumvent IT to get the data they need, when they need it.
So, here’s the crux of the problem. How do you ensure knowledge workers use Excel and Access to analyze data but not collect, integrate, or define it? How do you wean them off the more pernicious qualities of spreadsheets and personal databases without taking away their ability to effectively analyze data? There are three options: coercion, conversion, and co-existence.
Coercion
Coercion is the least effective strategy. Unless you work for the military where high-level commands are enforced throughout the rank and file, you can’t banish Excel and Access from corporate desktops unless you want an angry mob on your hands. In fact, coercion usually makes the problem worse. Users go underground with their data, managing their divisions and departments with clandestine spreadmarts that run parallel to the “official” corporate systems.
One symptom of this problem is poor query performance. Secretive spreadmart users will simply use a BI product as an ETL tool to create a data set that they want to analyze in Excel. Often, they submit “runaway” queries that bog down query performance for everyone. This forces IT administrators to constrain data access or limit the size of result sets, causing power users to take more drastic measures to circumvent IT and its information architecture. The old adage “What you resist, persists” certainly applies to spreadmarts.
Conversion
Conversion is a better strategy. Most organizations already have a commercial BI tool—or two or three or more. The conversion strategy involves persuading knowledge workers to use the organization’s standard BI tool to access data and perform standard analytical tasks rather than Excel or Access. This can be a hard sell. First, you need to make sure the BI environment provides at least 150 percent of the value of the spreadmarts. Then, you need to relentlessly sell the benefits of the BI environment. Here are some tips on selling your high-value BI environment:
- Saves Time. Tell spreadmart users that the BI tool runs on top of a data warehouse that collects and integrates data across more subject areas and data sources than they could possibly access and integrate on their own using Excel or Access. Calculate the amount of time and money they will save by converting to the standard BI environment.
- Validate Data Quality. Show spreadmart users how the data in the warehouse is much cleaner and more accurate than the data in their spreadsheets. (But be careful, they might not like having you expose the errors of their ways!)
- Deeper Insights. Show them how analyzing data across a multiplicity of subject areas (rather than one or two in spreadmarts) can lead to deeper insights and enhance their careers.
- Comparable Functions. Show them that the BI tool provides the same analytical functions and features as their spreadmart, such as briefing books, charting, report manipulation, offline usage, and others.
- Additional Functions. Show them the additional functions that the BI tool supports, such as the ability to schedule, share, or annotate reports, collaborate with colleagues, and publish reports to a portal, among other things.
- Better Support. Remind them that IT will support them every step of the way should they run into problems with the tool or the data but that they won’t get any support for their spreadmarts.
Persuading hard-core spreadmart users to convert to a BI environment is not easy. However, with a robust BI environment, a vigilant sales campaign and top management support, these hold-outs will eventually convert to the new environment.
Co-existence
The best strategy to eliminate spreadmarts is to co-opt them. You do this by making Excel a true client to your BI server. Here, knowledge workers use Excel to access to data and reports that are governed by a central server administered by the IT department. The client/server interface is hardwired into user desktops using a Web Service or Excel plug-in. Users get all the analytical features of a spreadsheet that they know and love, but the organization gets to manage access to data and reports in a controlled manner. This co-existence strategy is known as a “managed spreadsheet” approach. It is really an extension of the “conversion” strategy above.
Two Options. The co-existence strategy gives organizations two levels of control over the way users access data. In a tightly controlled managed spreadsheet environment, the IT department can create and run reports on the server which the Excel client can then access and interact with. The users cannot use Excel to access data directly, only predefined reports. This lets Excel users view and interact with any report on the BI server, whether or not it was created or modified to work in Excel.
In a more open environment, the IT department sets up a semantic layer consisting of “information objects” that represent data elements and measures in one or more source systems. Users drag and drop these objects into an Excel query builder to create a custom query and report. Since IT defines the objects, it ensures that all metrics and data elements are defined, calculated, and accessed in a consistent fashion.
Ideally, the semantic layer possesses a distributed query and join capability (i.e. an enterprise information integration tool) that enables users to create a report or table that blends data from multiple systems. This lets them for example, combine historical data from a data warehouse and real-time data from an operational system. Only the most recalcitrant and resourceful users will want or need to bypass this semantic layer to obtain data on their own.
Improved Excel Integration. This past year, many leading BI vendors have greatly enhanced their Excel integration capabilities. They have transformed these modules from static import/export mechanisms to true front-ends to analytical servers. Most importantly, Excel users can access data in existing BI reports (i.e. native BI report formats) as if the reports were created in Excel. That is, charts, formulas, and formats are converted to native Excel formats. (In contrast, first generation Excel “add-ins” simply dumped raw data into Excel and, at best, converted charts to static .JPG images.)
Endorsement from the Top is Mandatory!
Although we’ve defined three strategies for taming runaway spreadmarts, none is effective unless top management provides a clear and unambiguous signal that it will no longer tolerate spreadmarts.
One way for executives to do this is to stand behind a corporate BI standard—and more importantly—use the BI tool themselves or its output to run the business. It is also important that executives force representatives from each business unit or division to meet together until they hash out definitions and rules for common metrics and terms used in the BI environment. These actions serve as a catalyst to change individual attitudes and habits and free the organization of its relentless addiction to spreadmarts.
Whatever strategies you use to deliver a single version of the truth, the key is to be patient. Analytic habits don't change overnight. It will also take time for groups to transition their turf wars to other battlefields. With a heavy dose of patience, strong communications skills, and robust data warehousing environment, you should be able to tame the spreadsheet jockeys, or better yet, convert them into enthusiastic proselytizers of the new BI environment.
1See “Taming Spreadsheet Jockeys,” “TDWI Case Studies and Solutions,” TDWI e-newsletter, July, 2002.
2The report also shows that organizations have 2.1 data warehouses, six independent data marts and 4.5 operational data stores that they want to consolidate.
(Article appeared in "Case Studies & Solutions," July 20, 2004)