Executive Summary | Harnessing the Power of Diverse Data for Business Growth
This TDWI Best Practices Report focuses on understanding current challenges when using diverse data types.
- By Fern Halper, Ph.D.
- November 8, 2023
Today’s competitive business landscape demands comprehensive, data-driven insights. Enterprises are beginning to utilize diverse data—which includes structured, semistructured, and unstructured forms—to fuel these insights. Diverse data is extremely important for enriching data sets for analysis as well as promoting innovation in companies. Yet, results from this research indicate that it is still relatively early days for making the most out of diverse data.
Although survey respondents are using all kinds of data for understanding customers and improving operational efficiencies, they are facing challenges. On the data management front, these challenges include unifying diverse data for analysis, securing the data, and dealing with complex pipelines. On the analytics front, challenges include determining data quality for diverse data, building talent, and finding tools to help them analyze diverse data. They are also working out how to process diverse data—for instance, whether on a platform or in a pipeline. Respondents cite numerous ways they are trying to deal with these challenges.
Respondents are also determining how to govern diverse data. They realize that new data types mean new considerations for governance. For instance, what does “good” data quality mean for unstructured text documents or images? They are working through data quality and compliance issues along with putting policies in place for diverse data, including treating it responsibly.
Despite these challenges, respondents cite numerous opportunities for diverse data. These include more accurate analytics for better customer insights, improved operational efficiencies, a more data-driven culture, and greater innovation and collaboration. New approaches such as generative AI may also impact analysis and innovation with diverse data.
Some key findings include:
- Fifty-three percent of respondents cited better understanding of customers as the top driver for diverse data; 43% cited driving operational efficiencies
- Most respondents are collecting structured data; text data is already mainstream (32% of respondents); other unstructured data types are only beginning to enter mainstream adoption
- Cloud data platforms and hybrid environments are being used to manage diverse data, but new platforms such as vector databases and graph databases are also in the mix
- Nearly one-third (30%) of respondents are using data marketplaces to source diverse data; these respondents are more likely to monetize their data
- Fewer than half (40%) of respondents cited unifying diverse data for analysis as a top data management challenge; 59% of respondents cited data quality issues as a top analytics challenge
- Forty-eight percent of respondents agreed that generative AI might be a game-changer for analyzing unstructured data
Snowflake sponsored the research and writing of this report.
Fern Halper, Ph.D., is vice president and senior director of TDWI Research for advanced analytics. She is well known in the analytics community, having been published hundreds of times on data mining and information technology over the past 20 years. Halper is also co-author of several Dummies books on cloud computing and big data. She focuses on advanced analytics, including predictive analytics, text and social media analysis, machine learning, AI, cognitive computing, and big data analytics approaches. She has been a partner at industry analyst firm Hurwitz & Associates and a lead data analyst for Bell Labs. Her Ph.D. is from Texas A&M University. You can reach her on Twitter (twitter.com/fhalper) and on LinkedIn (linkedin.com/in/fbhalper).