July 1, 2016
Business users want the power of analytics, but analytics can only be as good as the data. To perform data discovery and exploration, use analytics to define desired business outcomes, and derive insights to help attain those outcomes, users need good, relevant data. Executives, managers, and other professionals are reaching for self-service technologies so they can be less reliant on IT and move into advanced analytics formerly limited to data scientists and statisticians. However, the biggest challenge nontechnical users encounter is the same one that has long troubled data scientists: slow, difficult, and tedious data preparation.
Data preparation is a hot topic on both the business and IT sides of organizations. It is also the focus of innovative software technology and methods aimed at accelerating, if not automating, the processes necessary to support business analytics. Preparing, blending, integrating, cleansing, transforming, governing, and defining the metadata of multiple sources of data (including new, raw big data in Hadoop) has been primarily an IT job; however, broadening interest in data science and analytics has drawn non-IT personnel into the execution of these tasks. Non-IT users such as business analysts, data analysts, and developers are looking for smarter self-service tools that reduce difficulties and make data preparation processes faster. IT, meanwhile, is interested in tools that can streamline data preparation, improve productivity, and enable IT to serve users better.
This TDWI Best Practices Report examines experiences with data preparation, discusses goals and objectives, and looks at important technology trends reshaping data preparation processes. From small organizations using spreadsheets and visual discovery tools to large enterprises trying to improve data quality and delivery for a variety of uses including business intelligence (BI) and advanced visual analytics, data preparation difficulties are a major concern. We find strong interest in improving data preparation and increasing self-service capabilities so that business users and analysts can do more on their own to prepare data without IT hand-holding.
In our report, TDWI Research advises organizations to focus improvements on reducing the time it takes to prepare data, which can help users realize insights from data faster. Users are weary of long and repetitive data preparation processes. Rapidly evolving technology is enabling organizations to automate and standardize steps and to build knowledge about the data that supports better reuse, sharing, transformation, and analysis. TDWI Research additionally advocates integrating data preparation with governance programs. As organizations increase self-service data use, they need to ensure that users observe good governance and support stewardship of data assets. Clearly documenting data preparation steps can help make governance more effective and give users confidence that they are working with trusted data in BI and analytics projects.