Definition
Retrieval-Augmented Generation (RAG) is a technique that combines generative models with information retrieval systems. It grounds outputs in external sources to improve factual accuracy.
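In the RAG-Sequence formulation of Lewis et al. (2020, cited in the references below), the retriever and generator are combined by marginalizing the generator's output over the top-k retrieved passages. In sketch form, with x the input query, z a retrieved passage, p_eta the retriever, and p_theta the generator:

```latex
% RAG-Sequence (Lewis et al., 2020): generation is marginalized over the
% top-k passages z returned by the retriever p_\eta for the query x.
p_{\mathrm{RAG}}(y \mid x) \;\approx\;
  \sum_{z \,\in\, \mathrm{top}\text{-}k\, p_\eta(\cdot \mid x)}
  p_\eta(z \mid x)\, p_\theta(y \mid x, z)
```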
Purpose
RAG aims to reduce hallucinations in generative AI by augmenting responses with retrieved documents rather than relying solely on knowledge stored in model parameters. It is especially useful for question answering and other knowledge-intensive tasks.
Importance
- Improves factual accuracy in LLM outputs.
- Enables domain-specific knowledge integration.
- Depends on reliable retrieval: answer quality is bounded by the relevance of the retrieved documents.
- Related to hybrid search and open-domain QA.
How It Works
- User provides a query or prompt.
- Retrieval system fetches relevant documents.
- Documents are passed into a generative model.
- Model generates responses grounded in retrieved content.
- Optional feedback loops (e.g., user relevance signals used to re-rank results or refresh the index) can improve future retrieval quality; a minimal sketch of this pipeline follows below.
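A minimal sketch of the pipeline above, in Python. The toy corpus, the keyword-overlap scoring function, and the generate() stub are illustrative assumptions, not a real system; a production setup would typically use a dense-embedding vector index for retrieval and a hosted LLM API for generation.

```python
# Minimal RAG sketch: retrieve relevant documents for a query, then build a
# grounded prompt for a generative model. Everything here is a placeholder
# standing in for real retrieval and generation components.
from collections import Counter

# 1. Toy document store (in practice: a vector database over embeddings).
CORPUS = [
    "RAG combines a retriever with a generative language model.",
    "Retrieval grounds model outputs in external documents.",
    "Hallucinations are plausible but unsupported model statements.",
]

def score(query: str, doc: str) -> int:
    """Keyword-overlap score; real systems use embedding similarity."""
    q_tokens = Counter(query.lower().split())
    d_tokens = Counter(doc.lower().split())
    return sum((q_tokens & d_tokens).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """2. Fetch the k most relevant documents for the query."""
    ranked = sorted(CORPUS, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """3. Pass retrieved documents to the generator as grounding context."""
    context = "\n".join(f"- {d}" for d in documents)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

def generate(prompt: str) -> str:
    """4. Stand-in for an LLM call (e.g., a hosted chat-completion API)."""
    return f"[model response grounded in a prompt of {len(prompt)} characters]"

if __name__ == "__main__":
    query = "How does RAG reduce hallucinations?"
    docs = retrieve(query)                         # retrieval step
    answer = generate(build_prompt(query, docs))   # grounded generation step
    print(answer)
```

Swapping the keyword retriever for an embedding index and the stub for a real model call changes only steps 1, 2, and 4; the prompt-grounding pattern in step 3 stays the same.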
Examples (Real World)
- OpenAI ChatGPT with browsing or retrieval plugins.
- Meta AI's original RAG model (Lewis et al., 2020): research on retrieval-grounded generation.
- Perplexity AI: retrieval-augmented conversational search.
References / Further Reading
- Lewis et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” NeurIPS 2020.
- Hugging Face Transformers: RAG model implementation and documentation.
- Stanford HAI Research on Retrieval Methods.