Fallbacks
When working with language models, you may encounter issues from the underlying APIs, e.g. rate limits or downtime. As you move your LLM applications into production, it becomes more and more important to have contingencies for errors. That's why we've introduced the concept of fallbacks.
Crucially, fallbacks can be applied not only at the LLM level but at the whole runnable level. This is important because different models often require different prompts. So if your call to OpenAI fails, you don't just want to send the same prompt to Anthropic - you probably want to use, e.g., a different prompt template.
Handling LLM API errors
This is maybe the most common use case for fallbacks. A request to an LLM API can fail for a variety of reasons - the API could be down, you could have hit a rate limit, or any number of other things.
IMPORTANT: By default, many of LangChain's LLM wrappers catch errors and retry. You will most likely want to turn those retries off when working with fallbacks (as shown below with maxRetries: 0). Otherwise the first wrapper will keep retrying rather than failing.
- npm: npm install @langchain/anthropic @langchain/openai @langchain/core
- Yarn: yarn add @langchain/anthropic @langchain/openai @langchain/core
- pnpm: pnpm add @langchain/anthropic @langchain/openai @langchain/core
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";

// Use a fake model name that will always throw an error
const fakeOpenAIModel = new ChatOpenAI({
  model: "potato!",
  maxRetries: 0,
});

const anthropicModel = new ChatAnthropic({});

const modelWithFallback = fakeOpenAIModel.withFallbacks([anthropicModel]);

const result = await modelWithFallback.invoke("What is your name?");

console.log(result);

/*
AIMessage {
  content: ' My name is Claude. I was created by Anthropic.',
  additional_kwargs: {}
}
*/
API Reference:
- ChatOpenAI from @langchain/openai
- ChatAnthropic from @langchain/anthropic
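Note that withFallbacks takes an array: the fallbacks are tried in order until one of them succeeds. A minimal sketch (the second broken model name is illustrative):

import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";

// Two primaries that will always throw, plus one working fallback.
// Fallbacks run in array order, so the Anthropic model is only invoked
// after both broken OpenAI models have failed.
const brokenModel = new ChatOpenAI({ model: "potato!", maxRetries: 0 });
const alsoBrokenModel = new ChatOpenAI({ model: "tomato!", maxRetries: 0 });
const workingModel = new ChatAnthropic({});

const modelWithFallbacks = brokenModel.withFallbacks([
  alsoBrokenModel,
  workingModel,
]);

await modelWithFallbacks.invoke("What is your name?");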
Fallbacks for RunnableSequences
We can also create fallbacks for sequences, and those fallbacks can be sequences themselves. Here we do that with two different models: ChatOpenAI and then the normal (non-chat) OpenAI model. Because OpenAI is NOT a chat model, you likely want a different prompt.
import { ChatOpenAI, OpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate, PromptTemplate } from "@langchain/core/prompts";

const chatPrompt = ChatPromptTemplate.fromMessages<{ animal: string }>([
  [
    "system",
    "You're a nice assistant who always includes a compliment in your response",
  ],
  ["human", "Why did the {animal} cross the road?"],
]);

// Use a fake model name that will always throw an error
const fakeOpenAIChatModel = new ChatOpenAI({
  model: "potato!",
  maxRetries: 0,
});

const prompt =
  PromptTemplate.fromTemplate(`Instructions: You should always include a compliment in your response.
Question: Why did the {animal} cross the road?
Answer:`);

const openAILLM = new OpenAI({});

const outputParser = new StringOutputParser();

const badChain = chatPrompt.pipe(fakeOpenAIChatModel).pipe(outputParser);

const goodChain = prompt.pipe(openAILLM).pipe(outputParser);

const chain = badChain.withFallbacks([goodChain]);

const result = await chain.invoke({
  animal: "dragon",
});

console.log(result);

/*
I don't know, but I'm sure it was an impressive sight. You must have a great imagination to come up with such an interesting question!
*/
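Since the result of withFallbacks is itself a runnable, the combined chain supports the rest of the usual runnable interface as well. For instance, continuing from the chain above (a sketch; the inputs are illustrative):

// batch runs the fallback chain over several inputs; each input falls
// back independently if the primary chain throws for it.
const results = await chain.batch([
  { animal: "dragon" },
  { animal: "unicorn" },
]);

console.log(results);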
API Reference:
- ChatOpenAI from @langchain/openai
- OpenAI from @langchain/openai
- StringOutputParser from @langchain/core/output_parsers
- ChatPromptTemplate from @langchain/core/prompts
- PromptTemplate from @langchain/core/prompts
Handling long inputs
One of the big limiting factors of LLMs is their context window. Sometimes you can count and track the length of prompts before sending them to an LLM, but in situations where that is hard or complicated, you can fall back to a model with a longer context length.
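If counting up front is feasible, here is a minimal sketch of that approach (getNumTokens is available on LangChain's language models; the 4096 limit is an illustrative assumption, not the model's exact window size):

import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" });
const prompt = `What is the next number: ${"one, two, ".repeat(3000)}`;

// getNumTokens estimates how many tokens a string will use for this model.
const numTokens = await llm.getNumTokens(prompt);
if (numTokens > 4096) {
  console.log("Prompt likely exceeds the context window; route to a longer-context model.");
}

When tracking prompt length like this is impractical, declare the longer-context model as a fallback instead: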
import { ChatOpenAI } from "@langchain/openai";

// Use a model with a shorter context window
const shorterLlm = new ChatOpenAI({
  model: "gpt-3.5-turbo",
  maxRetries: 0,
});

const longerLlm = new ChatOpenAI({
  model: "gpt-3.5-turbo-16k",
});

const modelWithFallback = shorterLlm.withFallbacks([longerLlm]);

const input = `What is the next number: ${"one, two, ".repeat(3000)}`;

try {
  await shorterLlm.invoke(input);
} catch (e) {
  // Length error
  console.log(e);
}

const result = await modelWithFallback.invoke(input);

console.log(result);

/*
AIMessage {
  content: 'The next number is one.',
  name: undefined,
  additional_kwargs: { function_call: undefined }
}
*/
API Reference:
- ChatOpenAI from @langchain/openai
Fallback to a better model
Often, we ask models to output in a specific format (like JSON). Models like GPT-3.5 can do this okay, but sometimes struggle. This naturally points to fallbacks - we can try with a faster and cheaper model first, but then if parsing fails, we can use GPT-4.
import { z } from "zod";
import { OpenAI, ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { StructuredOutputParser } from "@langchain/core/output_parsers";

const prompt = PromptTemplate.fromTemplate(
  `Return a JSON object containing the following value wrapped in an "input" key. Do not return anything else:\n{input}`
);

const badModel = new OpenAI({
  maxRetries: 0,
  model: "gpt-3.5-turbo-instruct",
});

const normalModel = new ChatOpenAI({
  model: "gpt-4",
});

const outputParser = StructuredOutputParser.fromZodSchema(
  z.object({
    input: z.string(),
  })
);

const badChain = prompt.pipe(badModel).pipe(outputParser);

const goodChain = prompt.pipe(normalModel).pipe(outputParser);

try {
  const result = await badChain.invoke({
    input: "testing0",
  });
} catch (e) {
  console.log(e);
  /*
  OutputParserException [Error]: Failed to parse. Text: "
  { "name" : " Testing0 ", "lastname" : " testing ", "fullname" : " testing ", "role" : " test ", "telephone" : "+1-555-555-555 ", "email" : " testing@gmail.com ", "role" : " test ", "text" : " testing0 is different than testing ", "role" : " test ", "immediate_affected_version" : " 0.0.1 ", "immediate_version" : " 1.0.0 ", "leading_version" : " 1.0.0 ", "version" : " 1.0.0 ", "finger prick" : " no ", "finger prick" : " s ", "text" : " testing0 is different than testing ", "role" : " test ", "immediate_affected_version" : " 0.0.1 ", "immediate_version" : " 1.0.0 ", "leading_version" : " 1.0.0 ", "version" : " 1.0.0 ", "finger prick" :". Error: SyntaxError: Unexpected end of JSON input
  */
}

const chain = badChain.withFallbacks([goodChain]);

const result = await chain.invoke({
  input: "testing",
});

console.log(result);

/*
{ input: 'testing' }
*/
API Reference:
- OpenAI from @langchain/openai
- ChatOpenAI from @langchain/openai
- PromptTemplate from @langchain/core/prompts
- StructuredOutputParser from @langchain/core/output_parsers
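Finally, note that these patterns compose: a model-level fallback can sit inside a chain that itself could carry a chain-level fallback. A minimal sketch (the prompt and model choices here are illustrative):

import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  ["human", "Tell me a fact about {topic}."],
]);

// Model-level fallback: if the OpenAI call errors, retry the same prompt
// against Anthropic before the chain as a whole fails.
const modelWithFallback = new ChatOpenAI({ maxRetries: 0 }).withFallbacks([
  new ChatAnthropic({}),
]);

const chain = prompt.pipe(modelWithFallback).pipe(new StringOutputParser());

const result = await chain.invoke({ topic: "bears" });
console.log(result);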