Retrieval augmented generation (RAG)

[Prerequisites]

Retrieval

Overview

Retrieval augmented generation (RAG) is a technique that enhances language models by combining them with external knowledge bases. It addresses a key limitation of models: they rely on fixed training datasets, which can leave their knowledge outdated or incomplete. Given a query, a RAG system first searches a knowledge base for relevant information, then incorporates the retrieved information into the model's prompt. The model uses that context to generate its response to the query. By pairing large language models with dynamic, targeted information retrieval, RAG helps build more capable and reliable AI systems.

Key concepts

[Figure: Conceptual Overview]

(1) Retrieval system: Retrieve relevant information from a knowledge base.

(2) Adding external knowledge: Pass retrieved information to a model.

Retrieval system

Models have internal knowledge that is often fixed, or at least not updated frequently because training is expensive. This limits their ability to answer questions about current events or to provide domain-specific knowledge. Knowledge injection techniques such as fine-tuning or continued pre-training can address this, but both are costly and often poorly suited to factual retrieval. Using a retrieval system offers several advantages:

Up-to-date information: RAG can access and use the latest data, keeping responses current.
Domain-specific expertise: With domain-specific knowledge bases, RAG can provide answers in specific domains.
Reduced hallucination: Grounding responses in retrieved facts helps minimize false or invented information.
Cost-effective knowledge integration: RAG offers a more efficient alternative to expensive model fine-tuning.
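
To make the idea of a retrieval system concrete, here is a minimal sketch of one. It assumes a tiny in-memory corpus and ranks documents by plain keyword overlap, which is a stand-in for the embedding-based search most real systems use. The `KeywordRetriever` class, its `search` method, and the `Doc` shape are hypothetical names chosen for illustration, not part of any library; only the `pageContent` field and the `invoke` call mirror how retrievers are used in this guide.

```typescript
// Hypothetical document shape (assumption); mirrors the `pageContent` field used in this guide.
interface Doc {
  pageContent: string;
}

// Lowercase the text and split it on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Toy retriever (assumption, not a library class): scores each document by
// the number of distinct query terms it contains and returns the top k.
class KeywordRetriever {
  private docs: Doc[];
  private k: number;

  constructor(docs: Doc[], k: number = 2) {
    this.docs = docs;
    this.k = k;
  }

  // Synchronous search over the in-memory corpus.
  search(query: string): Doc[] {
    const terms = Array.from(new Set(tokenize(query)));
    return this.docs
      .map((doc) => {
        const docTerms = new Set(tokenize(doc.pageContent));
        const score = terms.filter((t) => docTerms.has(t)).length;
        return { doc, score };
      })
      .filter((scored) => scored.score > 0) // drop documents with no overlap
      .sort((a, b) => b.score - a.score) // highest overlap first
      .slice(0, this.k)
      .map((scored) => scored.doc);
  }

  // Async wrapper so the object can be called like `await retriever.invoke(question)`.
  async invoke(query: string): Promise<Doc[]> {
    return this.search(query);
  }
}
```

Keyword overlap keeps the sketch dependency-free, but it misses synonyms and paraphrases; swapping in a different scoring function would not change the rest of a RAG pipeline, because the model only ever sees the returned documents.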

[Further reading]

See our conceptual guide on retrieval.

Adding external knowledge

With a retrieval system in place, we need to pass knowledge from this system to the model. A RAG pipeline typically does this in the following steps:

(1) Receive an input query.
(2) Use the retrieval system to search for relevant information based on the query.
(3) Incorporate the retrieved information into the prompt sent to the LLM.
(4) Generate a response that leverages the retrieved context.

As an example, here's a simple RAG workflow that passes information from a retriever to a chat model:

import { ChatOpenAI } from "@langchain/openai";

// Define a system prompt that tells the model how to use the retrieved context
const systemPrompt = `You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Context: {context}`;

// Define a question
const question =
  "What are the main components of an LLM-powered autonomous agent system?";

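// Retrieve relevant documents
// (`retriever` is assumed to be an already-configured retriever, e.g. one backed by a vector store)
const docs = await retriever.invoke(question);

// Combine the documents into a single string, separated by blank lines
const docsText = docs.map((d) => d.pageContent).join("\n\n");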

// Populate the system prompt with the retrieved context
const systemPromptFmt = systemPrompt.replace("{context}", docsText);

// Create a model
const model = new ChatOpenAI({
  model: "gpt-4o",
  temperature: 0,
});

// Generate a response
const response = await model.invoke([
  {
    role: "system",
    content: systemPromptFmt,
  },
  {
    role: "user",
    content: question,
  },
]);
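
The prompt-building step in the workflow above can also be factored into a small helper that is testable without calling a model. The following is a sketch under stated assumptions: `buildRagMessages`, `formatDocs`, `RetrievedDoc`, and `ChatMessage` are hypothetical names introduced here, and the template is a shortened variant of the system prompt used above.

```typescript
// Hypothetical shapes (assumptions); `pageContent` mirrors the field used above.
interface RetrievedDoc {
  pageContent: string;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Shortened variant of the system prompt from the example above.
const SYSTEM_TEMPLATE = `You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Context:
{context}`;

// Number each document and separate documents with a blank line so the
// model can tell where one retrieved passage ends and the next begins.
function formatDocs(docs: RetrievedDoc[]): string {
  return docs
    .map((d, i) => `[${i + 1}] ${d.pageContent.trim()}`)
    .join("\n\n");
}

// Build the message list for a chat model from a question and retrieved documents.
function buildRagMessages(
  question: string,
  docs: RetrievedDoc[]
): ChatMessage[] {
  const context =
    docs.length > 0 ? formatDocs(docs) : "(no relevant documents found)";
  return [
    {
      role: "system",
      // A replacer function inserts the context literally; a plain string
      // replacement would interpret sequences such as `$&` inside the documents.
      content: SYSTEM_TEMPLATE.replace("{context}", () => context),
    },
    { role: "user", content: question },
  ];
}
```

The returned array has the same `role`/`content` shape as the messages passed to `model.invoke` above. The explicit placeholder for an empty result set is a design choice: it gives the model a clear signal to answer that it doesn't know instead of receiving an empty context.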

[Further reading]

RAG is a deep area with many possible optimizations and design choices:

See this excellent blog from Cameron Wolfe for a comprehensive overview and history of RAG.
See our RAG how-to guides.
See our RAG tutorials.
See our RAG from Scratch course, with code and video playlist.
Also, see our RAG from Scratch course on Freecodecamp.
