Three Essential Trends for Data Leaders in 2024
Here’s what to look for -- and suggested actions to take -- when it comes to LLMs, verticalization within your data infrastructure, and consolidation of data operations.
- By Kyle Kirwan
- January 8, 2024
As we step into 2024, the data and analytics landscape continues to evolve, presenting data leaders with new opportunities as well as new challenges.
Three pivotal trends are poised to significantly impact the industry this upcoming year. Let's delve deeper into these trends and explore the implications for data teams.
Trend #1: RAG-Based LLMs
Large language models (LLMs) took center stage in 2023, transforming how data is processed and analyzed on a global scale. However, their rapid adoption brought some notable challenges. Casual users of generative AI apps such as ChatGPT might not see the extent of these issues, but in enterprise data applications, hallucinations and training-data cutoffs (a model knows nothing past the date its training data ends) can wreak havoc.
Enter retrieval-augmented generation (RAG), a promising approach to addressing these issues that could revolutionize data accessibility within enterprises.
RAG models combat hallucinations by grounding responses in auditable, up-to-date information. At query time, they retrieve relevant records from external data stores and pass them to the LLM as context, so an answer can be traced back to its sources and can reflect data newer than the model's training set. For data professionals, understanding and harnessing RAG-based LLMs is pivotal because these models could significantly enhance the reliability and relevance of the insights derived from them.
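To make the retrieve-then-generate flow concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a reference to any particular product: the in-memory `DOCUMENTS` store, the function names, and the simple keyword-overlap scoring, which stands in for the embedding-based search a production system would more likely use. The call to the LLM itself is omitted; the sketch stops at the prompt that would be sent.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical external data store. A real deployment would query a
# database, search index, or vector store instead of a dict.
DOCUMENTS = {
    "refunds": "Refunds are accepted within 30 days of delivery.",
    "shipping": "Standard shipping takes five business days.",
    "warranty": "Hardware carries a two year warranty.",
}


def tokenize(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing tokens
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, k=2):
    """Return up to k (doc_id, text) pairs, most similar to the question first."""
    q = tokenize(question)
    scored = [(cosine(q, tokenize(text)), doc_id, text)
              for doc_id, text in DOCUMENTS.items()]
    scored = [s for s in scored if s[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored[:k]]


def build_prompt(question, passages):
    """Assemble a prompt that restricts the model to the retrieved sources."""
    sources = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return (
        "Answer using only the sources below and cite the source ID.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    question = "How many days do customers have to request refunds?"
    print(build_prompt(question, retrieve(question)))
```

Because each retrieved passage carries its source ID into the prompt, a reviewer can check any answer against the record it cites, and updating the data store changes what the model sees without retraining it.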
Actionable advice: Embrace the adoption of RAG-based LLMs. Explore training in this area and consider implementing these models for key data initiatives to improve response accuracy, reduce hallucinations, and ensure the information provided is up-to-date and auditable.
Trend #2: Verticalization within data infrastructure providers
The trend towards verticalization within data infrastructure providers has been steadily gaining momentum, marked by significant acquisitions in recent years -- for example, Databricks buying Arcion and MosaicML in 2023, Snowflake purchasing Neeva and Streamlit, and dbt Labs acquiring Transform.
These acquisitions represent a move towards vertical integration, with large cloud providers aiming to offer comprehensive solutions within the data ecosystem. This not only redefines the market landscape but also presents new possibilities for integrating and utilizing these platforms effectively.
Actionable advice: Monitor how these integrations might impact your data operations tools and processes. Assess how the changing landscape might offer new solutions or alter the functionalities of existing tools in your data stack. Explore native capabilities from your cloud provider and determine when you can consolidate your tooling and when it makes sense to seek out an independent, best-of-breed solution.
Trend #3: Consolidation in data operations
The data operations sector is experiencing a parallel pattern, with startups launched between 2020 and 2022 now reaching the end of their runways. This is especially evident in a string of acquisitions by major industry players. Notably, companies such as IBM, Teradata, Collibra, and Bigeye have acquired other firms including Manta, Stemma, OwlDQ, SQLDep, and Data Advantage Group.
This marks a self-correcting trajectory within the industry, where the influx of data tools that came onto the market several years ago is consolidating into just a few key players.
Actionable advice: Fully evaluate the financial health, reputation, and strategic alignment of potential vendors. Ensure that your chosen partners can maintain services, provide consistent support, and align with your organization's long-term goals. Carefully review existing integrations when considering a data operations vendor to ensure the solution will work seamlessly in your data stack.
A Final Word
As we enter 2024, these three trends -- RAG-based LLMs, verticalization within data infrastructure, and the consolidation of data operations -- present both opportunities and challenges for data and analytics professionals. Staying informed, adaptable, and ready to embrace these shifts will be the key to thriving in an ever-evolving data landscape.
Kyle Kirwan is the co-founder and CEO of Bigeye. Earlier in his career, Kirwan was one of the first analysts at Uber. There, he launched the company's data catalog, Databook, as well as other tools used by thousands of its internal data users. He then went on to co-found Bigeye, a Sequoia-backed startup that works on data observability. You can reach Kyle on Twitter or LinkedIn.