Preparing Data for the Self-Service Analytics Experience
Self-service tools should help IT offload some responsibilities to users, thereby reducing the backlog and giving users the flexibility they seek.
- By David Stodder
- June 2, 2015
Washington Irving's legendary character Rip Van Winkle slept through massive changes to his life in 18th-century America, including the American Revolution. I have friends who have slept through earthquakes, big storms that knocked out power, and moments of domestic drama. However, I doubt many experienced professionals in the business intelligence (BI) and data warehousing community have been able to shut out the noise and sleep through the strong and well-marketed trend away from IT control and toward business-driven, self-service BI and analytics.
Data assets play an ever-greater role in business decision making; business users want more data and more control over how they access, analyze, present, and share it. The technology community has responded by providing easier-to-use data visualization, exploration, and preparation tools that require less IT handholding. It is one of those times when consumer demand and technology advancement are driving a trend together.
TDWI is covering this trend and has reshaped its educational and training offerings to help both IT professionals and business users learn about analytics. The offerings also help them prepare their firm's data assets (data warehouses, Hadoop systems, and other platforms) to handle widespread, "democratized" data access for users' self-service analytics processes. Please check out the upcoming TDWI Boston Conference (July 26-31, 2015), which is focused on building knowledge about how to execute high-value analytics programs.
Avoiding the Road to Data Perdition
BI and analytic data discovery tools may be getting easier to use, but the journey toward BI and analytics always starts with the data. Many users -- and many IT organizations -- make the move toward better tools when data volume, quality, and latency issues overwhelm current practices and become a threat to the organization's ability to meet its objectives. Small and midsize firms often begin with no IT function to manage the data, or at least no one with experience to take responsibility over data assets.
Frequently, all users have to work with are spreadsheets and limited reports based on disparate, application-specific databases. Self-service BI and data discovery tools can deliver much better visualization and data exploration, but the sources often remain limited to spreadsheets and siloed application-specific databases.
At larger firms, even if there is an enterprise BI standard, users grow tired of waiting: waiting for IT to find development time to address requests for BI reports and dashboards and waiting for IT to find systems time to run the reports and queries. Of course, once this is all set up, users frequently decide that they want different data or different queries and visualizations and the process must start over.
Users in these organizations will reach for self-service BI and analytics tools out of frustration with the IT backlog and the lack of flexibility. The tools allow them to explore data and experiment with visualizations on their own. Yet such users are often still limited in the sources they can access -- usually to what is available in the data warehouse, historical BI reports, or their own spreadsheets. Users may reach for data sources that IT has not sanctioned, such as flat files or externally sourced data, but the dubious quality of these sources can make it hard to integrate them with sanctioned sources without special effort.
Automating Data Preparation
The TDWI Boston Conference will have instructor-led and case study sessions aimed at helping users and data professionals solve data problems that limit what users can do with self-service BI and analytics. The need to solve such problems is fueling a second hot trend toward self-service data preparation and user-driven data blending, munging, and integration. This trend will no doubt be discussed during sessions and on the exhibit floor at the TDWI Boston Conference.
TDWI Research finds that in most organizations, IT remains primarily responsible for preparing data, which can include cleansing and enriching the data, consolidating records, and creating calculated fields, aggregations, and dimensions. IT also remains primarily responsible for data extraction, transformation, and loading (ETL). Many of the new tools essentially aim at building IT's intelligence into the software; vendors are applying analytics to data preparation through the use of machine learning and other algorithms. Some tools also have wizards that can recommend data sources for certain types of analysis, or recommend the front-end visual analytics and data discovery tools best suited to working with those sources.
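To make the preparation steps listed above more concrete, here is a minimal sketch in plain Python of cleansing, consolidating records, creating a calculated field, and aggregating. The field names (order_id, region, qty, unit_price), the sample rows, and the "last record wins" rule for duplicates are all assumptions made up for illustration; they do not come from TDWI Research or from any particular self-service tool.

```python
from collections import defaultdict


def prepare(rows):
    """Cleanse raw rows, consolidate duplicates, and add a calculated field."""
    consolidated = {}
    for row in rows:
        # Cleansing: trim stray whitespace and normalize capitalization.
        order_id = row["order_id"].strip()
        region = row["region"].strip().title()
        qty = int(row["qty"])
        unit_price = float(row["unit_price"])
        # Consolidating: keep one record per order_id (assumed rule: last wins).
        consolidated[order_id] = {
            "order_id": order_id,
            "region": region,
            "qty": qty,
            "unit_price": unit_price,
            # Calculated field derived from two source fields.
            "revenue": qty * unit_price,
        }
    return list(consolidated.values())


def revenue_by_region(prepared):
    """Aggregation: total revenue per region."""
    totals = defaultdict(float)
    for record in prepared:
        totals[record["region"]] += record["revenue"]
    return dict(totals)


# Hypothetical spreadsheet-style input with typical quality problems.
raw = [
    {"order_id": " A1 ", "region": "east ", "qty": "2", "unit_price": "10.0"},
    {"order_id": "A1", "region": "East", "qty": "2", "unit_price": "10.0"},
    {"order_id": "B7", "region": " WEST", "qty": "1", "unit_price": "5.5"},
]

prepared = prepare(raw)
totals = revenue_by_region(prepared)
```

Even in this toy case, each step encodes a judgment call (which duplicate to keep, how to spell a region) that IT has traditionally made on users' behalf. Those are the decisions self-service preparation tools try to automate or recommend.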
It may be too soon to tell whether automation can fully take over this work. The open question is whether tools for self-service BI, visual analytics, and data discovery, integrated with self-service data preparation technologies, can handle the complexity of requests that IT usually gets for data preparation and ETL. It is equally unclear whether they can deal with the performance, data volume, and data variety challenges that are growing as more users seek to apply analytics. My take is that IT and self-service tools will continue to coexist. The tools should help IT offload some responsibilities to users and thereby reduce the backlog and give users the flexibility they seek.
Stay tuned to TDWI to help you set your data strategy and take advantage of technology advancements that show promise in making users and IT more productive and effective. These are changes that you will not want to sleep through.