TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

How RAG Will Usher In the Next Generation of LLMs and Generative AI

Retrieval-augmented generation may provide a big step forward in addressing many of the issues that keep enterprises from adopting AI.

By Andy Xu
March 27, 2024

If you’ve been keeping up with the steady stream of developments in artificial intelligence -- particularly with generative AI -- you’ll have noticed that it usually involves a veritable alphabet soup of abbreviations. The latest of these is RAG, or retrieval-augmented generation. However, this is not just another jumble of letters -- it may be a big step forward in addressing many of the lingering issues facing AI adoption for business.

What is RAG?

For Further Reading:

Organizations Must Be Prudent To Realize Value In Generative AI

Generative AI and Its Implications for Data and Analytics

Executive Q&A: How Generative AI Is Changing How We Think About Analytics

RAG is an emerging AI technique designed to improve the output of large language models (LLMs) by accessing and incorporating information outside their training data sets before generating a response. RAG is an important tool for combating the nagging issue of hallucinations and for enhancing data security and privacy.

A typical AI request (called an inference) involves six basic steps:

1. Input data preparation. This could involve normalization, tokenization (for text), resizing images, or converting the data into a specific format.

2. Model loading. This model has already been trained on a data set and has learned patterns that it can apply to new data.

3. Inference execution. The prepared input data is fed into the model.

4. Output generation. The nature of this output depends on the task.

5. Post-processing. The raw output from the model may undergo post-processing to convert it into a more interpretable or useful form.

6. Result interpretation and action. Finally, the post-processed output is interpreted within the context of the application, leading to an action or decision. For example, in a medical diagnosis application, the output might be interpreted by a healthcare professional to inform a treatment plan.
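The six steps above can be sketched as a chain of small functions. This is only an illustrative toy, not a real model: the weight table `TOY_WEIGHTS` and every function name are assumptions made for this example, and a fixed word-score lookup stands in for a trained model.

```python
import re

# Hypothetical "trained model": word weights a real model would have learned.
TOY_WEIGHTS = {"great": 1.0, "good": 0.5, "slow": -0.5, "broken": -1.0}

def prepare_input(text):
    """Step 1: normalize and tokenize the raw text."""
    return re.findall(r"[a-z]+", text.lower())

def load_model():
    """Step 2: load an already-trained model (here, a fixed weight table)."""
    return TOY_WEIGHTS

def run_inference(model, tokens):
    """Steps 3 and 4: feed the input to the model and produce a raw score."""
    return sum(model.get(token, 0.0) for token in tokens)

def post_process(score):
    """Step 5: convert the raw score into a readable label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def interpret(label):
    """Step 6: turn the label into an application-level action."""
    return "escalate to support" if label == "negative" else "no action"

def handle_request(text):
    tokens = prepare_input(text)
    model = load_model()
    score = run_inference(model, tokens)
    label = post_process(score)
    return label, interpret(label)

print(handle_request("The app is slow and the export is broken"))
```

In a production system each function would be far heavier (a tokenizer, a model server, a business rules layer), but the order of the stages is the same.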

In a RAG-augmented inference, RAG most affects steps 3 and 4. In step 3, the application also searches whatever external data it has been given access to (internal company databases, external documents, etc.), supplementing the knowledge the model acquired from its training data. Then, in step 4, the top-matched documents from the retrieval step are supplied to the LLM as context, and the LLM generates the response for the specific use case (e.g., question answering or summarization).
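A minimal sketch of that retrieve-then-generate flow follows. It makes several assumptions: the sample `DOCUMENTS` are invented, relevance is scored by simple word overlap rather than by embeddings, and `generate` is a placeholder standing in for a call to whichever LLM the application uses.

```python
import re

# Hypothetical external documents the application is allowed to search.
DOCUMENTS = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by chat from 9am to 5pm on weekdays.",
]

def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Retrieval: rank documents by the number of words shared with the query."""
    query_terms = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_terms & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, context_docs):
    """Augmentation: place the retrieved documents ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

def generate(prompt):
    """Placeholder for a real LLM call; it only reports the prompt size."""
    return f"[LLM response based on a {len(prompt)}-character prompt]"

query = "How long do refunds take?"
top_docs = retrieve(query, DOCUMENTS, top_k=1)
prompt = build_prompt(query, top_docs)
print(prompt)
print(generate(prompt))
```

The model itself is unchanged; only the prompt it receives is enriched with retrieved text.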

To optimize inference performance, RAG often includes an offline process that builds embeddings for all external documents, indexes them, and stores them. A popular architectural choice is to use a vector database for indexing, storage, and retrieval.
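The offline indexing step might look roughly like the sketch below. It is a toy under stated assumptions: `embed` builds a normalized bag-of-words vector over the corpus vocabulary instead of calling a learned embedding model, and a plain in-memory dictionary stands in for a vector database. All names are hypothetical.

```python
import math
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text, positions):
    """Toy embedding: count known words, then scale to unit length."""
    vector = [0.0] * len(positions)
    for word in tokenize(text):
        if word in positions:
            vector[positions[word]] += 1.0
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector

def build_index(documents):
    """Offline step: embed every document once and store the vectors."""
    vocabulary = sorted({word for doc in documents for word in tokenize(doc)})
    positions = {word: i for i, word in enumerate(vocabulary)}
    vectors = [embed(doc, positions) for doc in documents]
    return {"positions": positions, "documents": list(documents), "vectors": vectors}

def search(index, query, top_k=1):
    """Online step: embed the query and rank documents by cosine similarity."""
    query_vector = embed(query, index["positions"])
    scored = [
        (sum(q * d for q, d in zip(query_vector, doc_vector)), doc)
        for doc, doc_vector in zip(index["documents"], index["vectors"])
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

index = build_index([
    "Quarterly revenue report for the finance team",
    "Onboarding checklist for new engineers",
    "Vacation policy and holiday calendar",
])
print(search(index, "vacation policy"))
```

Because the document vectors are computed ahead of time, the only embedding work left at query time is for the query itself.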

RAG’s Pros and Cons

RAG provides many advantages over relying on pre-trained or fine-tuned LLMs alone. It allows models to access up-to-date external data, mitigating the limitations of static training data sets, and it gives models the context they need to keep up with the latest data without constant retraining.

RAG also helps reduce LLM hallucinations by providing relevant and accurate data sources as context for generation. What’s more, RAG enhances data security and privacy: enterprises can allow their applications to access sensitive data while keeping that data separate from the model and under additional protection.

Depending on the use case, RAG often provides a more cost-efficient solution. New data can be embedded and added to the vector database, giving the application access to the latest information without having to continually retrain the LLM. At inference time, this additional context is retrieved by querying for the documents most relevant to the input query. That also reduces the need for a very large token context window, such as the one offered by models like Gemini 1.5 Pro.
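One way to picture the context window point: because retrieval returns only the most relevant chunks, an application can pack them into a modest token budget. The sketch below is an illustration under assumptions; the sample chunks are invented, `count_tokens` approximates tokens by whitespace-separated words (real tokenizers count differently), and the budget of 20 is arbitrary.

```python
def count_tokens(text):
    """Rough token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())

def pack_context(ranked_chunks, token_budget):
    """Keep the highest-ranked chunks that fit inside the token budget."""
    selected, used = [], 0
    for chunk in ranked_chunks:  # Assumed to be sorted most relevant first.
        cost = count_tokens(chunk)
        if used + cost > token_budget:
            continue  # Skip chunks that would overflow; smaller ones may still fit.
        selected.append(chunk)
        used += cost
    return selected, used

# Hypothetical retrieval results, already ranked by relevance.
ranked_chunks = [
    "Policy update: remote work requires manager approval starting in June.",
    "The 2019 handbook described an older approval process that involved "
    "three separate forms and a signature from the regional office.",
    "Approval requests are filed through the HR portal.",
]

selected, used = pack_context(ranked_chunks, token_budget=20)
for chunk in selected:
    print(chunk)
print("tokens used:", used)
```

Skipping an oversized chunk rather than stopping at it is a design choice; an application that must preserve strict rank order could stop at the first chunk that does not fit.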

Meanwhile, introducing RAG to the mix brings several challenges. For example, RAG presents new architectural requirements -- a vector database is usually needed to perform the indexing, storage, and retrieval functions, making design and implementation of a RAG system more complicated.

RAG also requires additional steps to generate embeddings of queries and to retrieve similar documents. These steps increase inference latency compared with inference on pre-trained or fine-tuned LLMs alone.

Of course, as with all things AI, the quality of the RAG result depends on the quality of the data sources. When there are data quality issues in the external data sources, the RAG application’s quality will be negatively affected.

Leveraging RAG in Business

RAG’s ability to access external data sources during inference time makes it an excellent fit for many applications. Common use cases include:

- Enterprise internal knowledge bases. RAG enables enterprises to keep their sensitive internal data safe and alleviates the need to keep LLMs updated. Furthermore, it helps provide accurate answers by minimizing hallucinations.
- Customer service assistants. RAG allows LLMs to use customer-specific information as context to provide personalized answers. It can also enable enterprises to avoid using sensitive customer data to train their LLMs.
- Domain-specific research tools. Using RAG, LLMs can be supplemented with domain-specific knowledge from external data sources, providing an alternative to training a domain-specific LLM.
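For the customer service case, one possible pattern is to filter the retrievable records by customer before ranking them, so that only the requesting customer's data can reach the prompt. The sketch below assumes invented sample records, a hypothetical `Record` type with a `customer_id` field, and word-overlap ranking in place of a real vector search.

```python
import re
from dataclasses import dataclass

@dataclass
class Record:
    customer_id: str
    text: str

# Hypothetical customer records held outside the model.
RECORDS = [
    Record("cust-001", "Order 5001 shipped on Monday and arrives Thursday."),
    Record("cust-002", "Order 7002 is delayed because of a stock shortage."),
    Record("cust-001", "Premium plan renews on the first of each month."),
]

def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_for_customer(query, customer_id, records, top_k=1):
    """Filter to one customer's records first, then rank by word overlap."""
    allowed = [record for record in records if record.customer_id == customer_id]
    query_terms = tokenize(query)
    ranked = sorted(
        allowed,
        key=lambda record: len(query_terms & tokenize(record.text)),
        reverse=True,
    )
    return [record.text for record in ranked[:top_k]]

context = retrieve_for_customer("Where is my order?", "cust-001", RECORDS)
print(context)
```

Filtering before ranking means another customer's records are never candidates for the context, regardless of how well they match the query.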

A Case in Point

One example illustrates how valuable RAG can be in practice. A top publisher wanted to make use of its immense archive of valuable content, but was faced with the daunting task of efficiently searching that archive for relevant materials to provide suggestions and insights to its writers and editors -- a virtually impossible process without advanced AI technology.

Using GPT and BERT models from a model library, the customer quickly built a powerful RAG application on its existing AI platform. This solution automatically sifts through the publisher’s extensive content repository, identifies pertinent information, and makes timely, AI-driven recommendations to the editorial team.

The introduction of the RAG application dramatically improved the efficiency and depth of their reporting, allowing the customer to deliver richer, more insightful narratives and to maintain their position as a leader in their field with cutting-edge, data-backed storytelling.

The Retrieval-Augmented Future

Once companies become familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers. Chatbots and other conversational systems that use natural language processing can benefit significantly from RAG and generative AI. For example, a generative AI model supplemented with a medical database could be a great assistant for doctors and nurses.

In the future, RAG technology may help generative AI take appropriate action based on contextual information and user prompts. For example, a RAG-augmented AI system might identify the highest-rated vacation rental in Kihei and then initiate booking a two-bedroom beach house within walking distance of your favorite snorkeling spot.

Although RAG might add some complexity to your enterprise AI undertakings, RAG is worth the effort for many use cases. Retrieval-augmented generation builds on the benefits of LLMs by making them more timely, accurate, secure, and contextual. For business applications of generative AI, RAG is an important capability to understand and incorporate within your AI applications.
