Self-Service BI: Barriers, Benefits, and Best Practices
What is driving the promise of self-service BI, how is the landscape changing, and what best practices can help you succeed?
- By Jim Gallo
- April 24, 2018
Self-service BI isn't new. IT departments have long wanted to be freed from running queries, generating reports, and building dashboards. Some users embrace the idea of being more self-sufficient in meeting their information needs. Other users, however, lack the skills, training, tools, or inclination to be self-reliant. Historically, the self-service concept has been appealing but has often fallen short of expectations.
What's behind the renewed interest in self-service BI? It started four or five years ago with new and emerging technologies, although recent innovations are sparking interest from a different perspective. There were two major contributing factors from 2010 to 2012. First was the emergence of nontraditional BI vendors that marketed themselves directly to businesses as having easy-to-use tools that allowed individuals to download licenses one at a time, thus creating critical mass within the larger enterprise.
The second factor was overhyping by the big data platform companies that promised the miraculous ability to ingest data into a schema-less environment. Any business user could then instantly point their self-service tools at the data for analytics and reporting. I clearly remember sitting in on product presentations and webinars where product CTOs and some well-known industry pundits were selling this snake oil to the masses.
More recently, the rise of the data marketplace as the latest business enabler has taken the concept of self-service to a new level. To wit, data marketplace products support, among other things, self-service data integration (profiling, cleansing, metadata capture and definition, ETL, etc.), where anyone can create data sets that can be cataloged and used by all data consumers.
There are few barriers to simple self-service BI, but there are major barriers to the successful adoption of vetted, effective self-service BI once you consider the veracity of the data used in analyses and reports.
Basic self-service BI has been around since the dawn of computing and continues to manifest as spreadsheets, databases built by individuals to suit their specific needs, and the indiscriminate use of BI tools against any data set regardless of origin. The full instantiation of clearly defined self-service BI—where semantic layers are created to provide business-friendly access to complex databases and everyone can build dashboards and reports or run ad hoc queries as needed—is more challenging.
The barriers to self-service BI using unified, trusted, vetted, and approved data sets are much the same as traditional BI:
Speed to value. The time it takes to create complete and consistent data sets (whether in the form of a data warehouse, data mart, data lake, or data marketplace file) that allow users to conduct analyses and generate reports rather than spending most of their time wrangling the data.
Morph speed. The time it takes IT to add tables and columns to existing data stores and to modify semantic layers.
Lack of commitment. The unwillingness to make a long-term investment and an organizational commitment to trusted data stores and governed metrics, and the reluctance to dismantle shadow IT organizations.
Data security. Protecting data sets that reside on mainframes, servers, workstations, and everywhere in between—as well as identifying, isolating, and obfuscating PII, HIPAA, and other types of personal information while at rest, in use, and in flight.
Data quality and consistency. The bane of the analytics industry. Let's face it, the concept of a "single version of the truth" has been around since the mid-80s, and self-service BI has been possible for at least the past 20 years. Yet, after all that time, organizations are in the same predicament. As many organizations have learned, the preponderance of the time, cost, and risk of BI-related initiatives comes from data integration due to a lack of information quality and consistency. Relatively speaking, the effort associated with enabling self-service BI pales in comparison. Seriously, when's the last time a BI program failed because it took too long to create a dashboard or report?
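The obfuscation requirement noted under data security can be sketched with a simple one-way masking helper. This is a minimal illustration, not a production PII solution; the field names and sample record are hypothetical, and a real deployment would manage the salt as a secret.

```python
import hashlib

def mask_pii(value: str, salt: str = "demo-salt") -> str:
    # Deterministic one-way hash: the same input always maps to the same
    # token, so masked columns can still be joined, but the original
    # identifier is not recoverable from the token.
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

# Hypothetical customer record with one sensitive field.
record = {"customer_id": "C-1001", "ssn": "123-45-6789", "total": 42.0}
masked = {k: (mask_pii(v) if k == "ssn" else v) for k, v in record.items()}
```

The non-sensitive fields pass through untouched, while the SSN becomes a stable, opaque token suitable for use in sandboxes or data lakes.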
On a positive note, enterprises are changing their thinking, and technology is also helping them support self-service BI.
Among the changes I see:
More enterprises are embracing "fit for purpose." There is a growing recognition that it's okay to create different types of data stores that trade off trust against agility. Use cases that require a high degree of data rigor (such as financial and compliance reporting) should leverage trusted data stores such as an operational data store or a data warehouse. Needs that require an expedient answer, where hypotheses or "close enough" answers are sufficient, can leverage analytical sandboxes, data lakes, or staging areas.
IT support is still needed. We've seen the tools that were created to address the self-service market become more complex and need IT support. The growth in self-service usage and popularity brought with it the need for server-class products and the need to scale. In other words, self-service products are beginning to look just like the enterprise-class tools that have been around since the '90s, requiring IT support and architectural constructs to serve the masses.
The need for data governance isn't going away. In the past two to three years I've seen a marked increase in organizations that are finally getting serious about data governance. This stems in part from a recognition that self-service BI in and of itself does not solve the data quality, shared metrics, or veracity problems that fueled the data warehousing industry.
The emerging data marketplace solutions offer hope for a much-needed middle ground. I've always believed that data quality is in the eye of the beholder and if a self-service data set needs to be curated, in part or in whole, the business community will address the need. At the same time, data stores that require a high degree of trust will continue to evolve and will be included in the marketplace catalog as an information source available for self-service BI. Perhaps this is the place where IT and shadow IT can finally coexist peacefully.
The Role of the BI/Analytics Director
The BI or analytics director should enable the creation and use of different types of data stores and serve as a triage specialist. On demand, the director should work with data consumers to understand where their needs fall on the trust-versus-agility curve, then provide the most expedient path forward, whether through a formal governance and development cycle or by standing up a lake or staging area loaded directly from the source, regardless of quality and completeness.
In the event a data marketplace is created, the director's role should be to mentor and coach data consumers about best practices for requisitioning, curating, and cataloging data sets for maximum reuse.
Self-Service Best Practices
Once self-service BI is in place, enterprises will find the following best practices helpful:
Parameterization. Use parameters as often as possible to drive flexibility. As an example, one parameter-driven report that allows users to select the desired time, region, and/or product(s) is preferable to recreating the same report for each variation of time, region, and product.
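The idea above can be sketched as one parameterized query that replaces a separate hardcoded report per time/region/product combination. The table name, columns, and sample rows here are hypothetical, using an in-memory SQLite database purely for illustration.

```python
import sqlite3

# Hypothetical sales table, stood up in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(2017, "East", "Widget", 100.0),
     (2017, "West", "Widget", 150.0),
     (2018, "East", "Gadget", 200.0)],
)

def run_sales_report(year, region, product):
    # One parameter-driven query serves every variation of time,
    # region, and product -- no per-combination report needed.
    cur = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE year = ? AND region = ? AND product = ?",
        (year, region, product),
    )
    return cur.fetchone()[0]

print(run_sales_report(2017, "East", "Widget"))  # 100.0
```

The same function answers any slice the user selects, which is exactly the flexibility the parameter-driven report provides.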
Drop-down lists. Leveraging drop-down lists built from the database tables yields more accurate parameter selection than having users type in parameter values.
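Building the pick list from the data itself can be sketched as a `SELECT DISTINCT` helper. The table and column names are hypothetical; the point is that users can only choose values that actually exist in the data.

```python
import sqlite3

# Hypothetical table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT)")
conn.executemany("INSERT INTO sales VALUES (?)", [("East",), ("West",), ("East",)])

def dropdown_options(conn, table, column):
    # Derive the drop-down's options from the live data, so the
    # list stays in sync as new values appear. (Table/column names
    # should come from a vetted list, not raw user input.)
    rows = conn.execute(f"SELECT DISTINCT {column} FROM {table} ORDER BY {column}")
    return [r[0] for r in rows]

print(dropdown_options(conn, "sales", "region"))  # ['East', 'West']
```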
Attribution. Include some form of attribution on the report to signify the trustworthiness of the contents. For example, include a seal of approval or "Created from EDW" in the report footer to signify that the report contains information from trusted/certified sources, and exclude such attribution when the data did not come from a trusted source.
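A minimal sketch of the attribution rule: stamp the footer only when the data came from a certified store. The set of trusted source names is an assumption for illustration.

```python
# Hypothetical registry of certified sources maintained by governance.
TRUSTED_SOURCES = {"EDW", "finance_mart"}

def report_footer(source: str) -> str:
    # Include the seal of approval only for trusted/certified sources;
    # otherwise flag the report so consumers know to verify it.
    if source in TRUSTED_SOURCES:
        return f"Created from {source} (certified source)"
    return "Source not certified; verify before distribution"
```

A BI tool would call this when rendering the report footer, so trustworthiness travels with the report itself.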
Algorithm management. Organizations spend considerable time and money to get the data right, but often lose sight of the fact that data can also be created within reports and dashboards. The self-service community should agree on and publish common algorithms so they can be used across the enterprise.
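Publishing agreed algorithms can be as simple as a shared, versioned module every report imports. The metric definition below is a hypothetical example of such a published algorithm, not a statement of any particular enterprise's formula.

```python
# Hypothetical shared metrics module, published so every dashboard
# and report computes "gross margin" the same way.

def gross_margin(revenue: float, cogs: float) -> float:
    """Agreed enterprise definition: (revenue - cogs) / revenue."""
    if revenue == 0:
        # Convention chosen here: define margin as 0 when there is no revenue.
        return 0.0
    return (revenue - cogs) / revenue
```

Once the community imports this one function instead of re-deriving the formula in each report, the metric cannot silently drift between dashboards.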
Catalogs. Catalog reports and dashboards so users may determine if an existing analysis is available before creating something new.
Design for the ultimate consumer. The type of information consumer and the mode of consumption should be considered when reports and dashboards are created. Executives may wish to consume information in a visualization or infographic while a financial analyst may prefer detailed reports filled with numbers. Similarly, a dashboard intended for consumption on a mobile device should contain less information per screen and be easily navigable given the form factor of the device itself.