RAG (Retrieval Augmented Generation)

Technology

An AI technique that enhances language models by retrieving relevant information from an external knowledge base to generate more accurate and context-aware responses. It is used by the enterprise chat platform Glue.


First Mentioned

10/12/2025, 6:12:44 AM

Last Updated

10/12/2025, 6:21:25 AM

Research Retrieved

10/12/2025, 6:21:25 AM

Summary

Retrieval-Augmented Generation (RAG) is a technology designed to enhance large language models (LLMs) by allowing them to access and integrate external, authoritative information before generating responses. Introduced in a 2020 research paper by Meta, RAG operates by retrieving relevant text from various sources such as databases, uploaded documents, or the web, which then augments the LLM's pre-existing training data. This process significantly improves the accuracy and timeliness of LLM responses, particularly for domain-specific queries, and helps mitigate AI hallucinations by grounding outputs in factual data. RAG also offers benefits like reduced computational and financial costs associated with frequent LLM retraining, and increased transparency by enabling LLMs to cite their sources. A practical application of RAG is seen in AI-native enterprise chat platforms like Glue, which leverages RAG technology from a service called Raggi to search internal documents and formulate responses.

Referenced in 1 Document
Research Data
Extracted Attributes
  • Type

    Technology

  • Origin

    2020 research paper from Meta

  • Full Name

    Retrieval-Augmented Generation

  • Mechanism

    Retrieves relevant information from external data sources (databases, documents, web) and provides it to the LLM alongside the user query.

  • Key Benefit

    Increases transparency by allowing LLMs to cite sources.

  • Primary Purpose

    Optimizing the output of large language models (LLMs) by referencing external knowledge bases.
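
The mechanism listed above, retrieving relevant text from external sources and handing it to the LLM alongside the user query, can be sketched in a few lines of Python. The knowledge base, the keyword-overlap scoring, and the prompt template below are illustrative assumptions, not part of any specific RAG implementation:

```python
# Minimal sketch of the RAG mechanism: retrieve relevant snippets from an
# external store, then give them to the LLM together with the user query.

KNOWLEDGE_BASE = [
    "Glue is an AI-native enterprise chat platform.",
    "RAG was introduced in a 2020 research paper by Meta.",
    "LLMs rely on static training data that can become outdated.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_augmented_prompt(query: str) -> str:
    """Combine retrieved context with the user query for the LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."

prompt = build_augmented_prompt("When was RAG introduced and by whom?")
print(prompt)
```

A production system would replace the keyword overlap with embedding similarity over a vector index, but the shape of the augmented prompt is the same.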

Timeline
  • RAG (Retrieval-Augmented Generation) was first introduced in a research paper from Meta. (Source: Summary)

    2020

Retrieval-augmented generation

Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information. With RAG, LLMs do not respond to user queries until they have consulted a specified set of documents. These documents supplement the LLM's pre-existing training data, allowing the model to use domain-specific and/or updated information that is not available in that data. For example, this helps LLM-based chatbots access internal company data or generate responses based on authoritative sources.

RAG improves LLMs by performing information retrieval before generating a response. Unlike traditional LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources. According to Ars Technica, "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts." This method helps reduce AI hallucinations, which have caused chatbots to describe policies that don't exist, or to recommend nonexistent legal cases to lawyers who are looking for citations to support their arguments. RAG also reduces the need to retrain LLMs with new data, saving on computational and financial costs.

Beyond efficiency gains, RAG allows LLMs to include sources in their responses, so users can verify the cited material. This provides greater transparency, as users can cross-check retrieved content to ensure accuracy and relevance. The term RAG was first introduced in a 2020 research paper from Meta.
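The retrieve-then-generate loop with source citation described above can be sketched as follows. The corpus, the bag-of-words cosine similarity, and the echo-style `answer` stub are illustrative assumptions standing in for a real vector index and LLM call:

```python
# Hedged sketch of retrieve-then-generate with source citation: pull relevant
# text first, answer from it, and return the sources so users can verify.

import math
from collections import Counter

CORPUS = {
    "meta-2020.pdf": "The term RAG was introduced in a 2020 Meta research paper.",
    "handbook.md": "Internal company data can be surfaced to chatbots via RAG.",
    "faq.txt": "Hallucinations are reduced by grounding answers in documents.",
}

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    """Return the top-k (source, text) pairs ranked by similarity to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(
        CORPUS.items(),
        key=lambda kv: cosine(q, Counter(kv[1].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def answer(query: str) -> dict:
    sources = retrieve(query)
    context = " ".join(text for _, text in sources)
    # In a real system this context would be passed to an LLM for generation;
    # here we echo it to keep the sketch self-contained.
    return {"answer": context, "citations": [name for name, _ in sources]}

result = answer("When was the term RAG introduced?")
```

Returning the citation list alongside the answer is what enables the transparency benefit noted above: users can open the cited documents and cross-check the generated text.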

Web Search Results
  • What is RAG? - Retrieval-Augmented Generation AI Explained - AWS

    Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific [...] RAG is one approach to solving some of these challenges. It redirects the LLM to retrieve relevant information from authoritative, pre-determined knowledge sources. Organizations have greater control over the generated text output, and users gain insights into how the LLM generates the response. What are the benefits of Retrieval-Augmented Generation? RAG technology brings several benefits to an organization's generative AI efforts. [...] Without RAG, the LLM takes the user input and creates a response based on information it was trained on, or what it already knows. With RAG, an information retrieval component is introduced that utilizes the user input to first pull information from a new data source. The user query and the relevant information are both given to the LLM. The LLM uses the new knowledge and its training data to create better responses. The following sections provide an overview of the process.

  • Retrieval-augmented generation - Wikipedia

    Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information. With RAG, LLMs do not respond to user queries until they refer to a specified set of documents. These documents supplement information from the LLM's pre-existing training data. This allows LLMs to use domain-specific and/or updated information that is not available in the training data. For example, this helps LLM-based chatbots access internal company [...] Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating an information-retrieval mechanism that allows models to access and utilize additional data beyond their original training set. AWS states, "RAG allows LLMs to retrieve relevant information from external data sources to generate more accurate and contextually relevant responses" ("indexing"). This approach reduces reliance on static datasets, which can quickly become outdated. When a user submits a [...] RAG improves large language models (LLMs) by incorporating information retrieval before generating responses. Unlike traditional LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources. According to _Ars Technica_, "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts." This method helps reduce AI hallucinations, which

  • What is Retrieval-Augmented Generation (RAG)? - Google Cloud

    RAG (Retrieval-Augmented Generation) is an AI framework that combines the strengths of traditional information retrieval systems (such as search and databases) with the capabilities of generative large language models (LLMs). By combining your data and world knowledge with LLM language skills, grounded generation is more accurate, up-to-date, and relevant to your specific needs. [...] By fine-tuning or prompt-engineering the LLM to generate text entirely based on the retrieved knowledge, RAG helps to minimize contradictions and inconsistencies in the generated text. This significantly improves the quality of the generated text, and improves the user experience.

  • What Is Retrieval-Augmented Generation aka RAG | NVIDIA Blogs

    What Is Retrieval-Augmented Generation, aka RAG? Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with information from specific and relevant data sources. January 31, 2025 by Rick Merritt. Editor’s note: This article, originally published on Nov. 15, 2023, has been updated. To understand the latest advancements in generative AI, imagine a courtroom. [...] “We always planned to have a nicer sounding name, but when it came time to write the paper, no one had a better idea,” said Lewis, who now leads a RAG team at AI startup Cohere. So, What Is Retrieval-Augmented Generation (RAG)? Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with information fetched from specific and relevant data sources. [...] With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets. For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.

  • RAG and generative AI - Azure AI Search - Microsoft Learn

    Retrieval Augmented Generation (RAG) in Azure AI Search (2025-08-18). Retrieval Augmented Generation (RAG) is a design pattern that augments the capabilities of a chat completion model like ChatGPT by adding an information retrieval step, incorporating your proprietary enterprise content for answer formulation. For an enterprise solution, it's possible to fully constrain generative AI to your enterprise content.