Self-Service Data Access: The Foundation for Emerging Analytics
New practices such as self-service data prep, exploration, and discovery don't work well (or at all) without new ways of making data more accessible.
- By Philip Russom
- September 14, 2016
For years we lived with the old paradigm, in which a data provider (a highly technical person) took weeks or months to create data sets for other technical users or for business end users.
Recently this has given way to a new paradigm, where a wide range of user types -- from highly technical data scientists and analysts to mildly technical business analysts and managers -- have the right tools and skills to do their own data provisioning. The result is greater speed, agility, exploration, and discovery than ever before for these user types.
Trends Driving Self-Service Practices
A number of trends have driven up the importance of self-service data access:
Fast pace of business: End users cannot afford to wait a few weeks while IT or a data management team creates a unique data set for them.
New data sources to leverage: Many organizations are collecting big data, which tends to come from new sources (e.g., Web, machines, or devices). New data leads to new insights and operational advantages, but only if users can work with it.
More business users are technical enough for hands-on data work: Business people depend more than ever on data for running their departments and units. Many have learned data basics so they can construct their own queries, models, and data sets -- when given appropriate tools and data platforms.
Self-Service Functions Are Growing
In response to users' needs, a variety of tool vendors now support some form of self-service, as seen in tools for data visualization, analytics, enterprise business intelligence, and data integration. Self-service is enabled by selecting just the right subsets of technical functions and making them easy to use.
Self-service functions, in turn, enable a variety of emerging analytics practices, including data exploration, discovery, data prep, visualization, and various forms of analytics.
Note that these emerging practices do not work well -- or at all -- without self-service data access. After all, how can mildly technical users explore, prepare, or analyze data that they cannot access easily? For this reason, self-service data access is the foundation for other emerging self-service practices.
Enabling Self-Service Data Access
For self-service data access to work properly, a number of things need to be in place.
Data stores for self-service: Technical users integrate (but don't aggregate) the data that known user constituencies need for self-service tasks, from exploration to analysis. Depending on the data and the types of exploration and analytics the users want to perform, the resulting data stores range from data warehouses and marts to operational data stores and data hubs to data lakes on Hadoop.
Business metadata: We say "metadata" as if it's one monolithic thing, whereas our uses for it demand multiple forms. These include technical metadata (for automated software access), business metadata (human language descriptions that business people can understand), and operational metadata (which records details about a data access event).
Among these, business metadata is a critical success factor and therefore a firm requirement for self-service data access. Without it, few business people (or other less technical user types) can even browse data, much less construct queries, data sets, and analyses.
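To make the distinction concrete, here is a minimal sketch of how a catalog might pair technical metadata with business metadata so that a less technical user can find data by its plain-language description. Every name in it (CatalogEntry, search_catalog, and the sample columns) is hypothetical and chosen only for illustration; it does not represent any particular tool's catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TechnicalMetadata:
    """What software needs for automated access (hypothetical fields)."""
    column: str
    data_type: str


@dataclass(frozen=True)
class BusinessMetadata:
    """Human-language description a business person can understand."""
    label: str
    description: str


@dataclass(frozen=True)
class CatalogEntry:
    technical: TechnicalMetadata
    business: BusinessMetadata


def search_catalog(catalog, term):
    """Return entries whose business label or description mentions the term."""
    needle = term.lower()
    return [
        entry
        for entry in catalog
        if needle in entry.business.label.lower()
        or needle in entry.business.description.lower()
    ]


# Sample entries, invented for this sketch.
CATALOG = [
    CatalogEntry(
        TechnicalMetadata("cust_ltv_amt", "DECIMAL(12,2)"),
        BusinessMetadata(
            "Customer lifetime value",
            "Projected total revenue from a customer",
        ),
    ),
    CatalogEntry(
        TechnicalMetadata("ord_dt", "DATE"),
        BusinessMetadata("Order date", "Calendar date the order was placed"),
    ),
]

if __name__ == "__main__":
    for entry in search_catalog(CATALOG, "revenue"):
        print(entry.business.label, "->", entry.technical.column)
```

A user who searches for "revenue" is led to the cryptically named cust_ltv_amt column without needing to know that name in advance, which is the role business metadata plays in browsing.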
Controls for access and use of data: Self-service data access risks compliance infractions, privacy violations, unauthorized access by employees, and security breaches. These risks can be managed and mitigated by a data governance program, assisted by the governance, security, and auditing features built into modern tools and data platforms.
For example, many users employ operational metadata as an audit trail for examining data access and usage. Analytics sandboxing is an emerging practice that isolates sensitive data in a closed environment, which controls the distribution of data sets and analyses based on that data. These tool functions and practices can apply to any data store, including those for self-service data access.
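As a rough illustration of operational metadata serving as an audit trail, the sketch below records one event per data access and then answers the kind of question a governance team might ask. The AccessEvent and AuditTrail names, their fields, and the sample users and data sets are assumptions made for this example, not features of a specific platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """One piece of operational metadata: who touched which data set, and when."""
    user: str
    dataset: str
    action: str
    at: datetime


class AuditTrail:
    """In-memory log of access events (a real system would persist these)."""

    def __init__(self):
        self._events = []

    def record(self, user, dataset, action, at=None):
        """Append an event, stamping it with the current UTC time by default."""
        event = AccessEvent(user, dataset, action, at or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def accesses_to(self, dataset):
        """All recorded events for one data set, in the order they occurred."""
        return [e for e in self._events if e.dataset == dataset]

    def users_of(self, dataset):
        """Distinct users who accessed a data set, sorted by name."""
        return sorted({e.user for e in self.accesses_to(dataset)})


if __name__ == "__main__":
    trail = AuditTrail()
    trail.record("avery", "customer_pii", "query")
    trail.record("blake", "sales_mart", "export")
    trail.record("casey", "customer_pii", "export")
    print(trail.users_of("customer_pii"))
```

Keeping the events immutable and append-only is a deliberate choice in this sketch, since an audit trail is only useful if past records cannot be quietly altered.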
Self-Service Is Worth the Effort
Creating an environment conducive to self-service first requires making your data accessible to all your users. Self-service data access and related emerging practices will be worth the effort; you will see a variety of users benefit by gaining greater agility and increasing discovery.
Philip Russom is director of TDWI Research for data management and oversees many of TDWI’s research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 500 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at firstname.lastname@example.org, @prussom on Twitter, and on LinkedIn at linkedin.com/in/philiprussom.