Time-Weighted Retriever
A Time-Weighted Retriever is a retriever that takes recency into account in addition to similarity. The scoring algorithm is:
let score = (1.0 - this.decayRate) ** hoursPassed + vectorRelevance;
Notably, hoursPassed above refers to the time since the object in the retriever was last accessed, not since it was created. This means that frequently accessed objects remain "fresh" and score higher.
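To make the last-accessed point concrete, here is a minimal sketch that applies the scoring formula to two documents with the same creation time and the same similarity. The helper names (hoursSince, combinedScore), the timestamps, and the values chosen for decayRate and vectorRelevance are assumptions for illustration; only the formula itself comes from the text above.

```typescript
// Illustrative only: these helpers are hypothetical and are not part of LangChain.
const MS_PER_HOUR = 60 * 60 * 1000;

// Hours elapsed between the last access and "now".
function hoursSince(lastAccessedAt: Date, now: Date): number {
  return (now.getTime() - lastAccessedAt.getTime()) / MS_PER_HOUR;
}

// The scoring formula quoted above: recency term plus vector relevance.
function combinedScore(
  decayRate: number,
  hoursPassed: number,
  vectorRelevance: number
): number {
  return (1.0 - decayRate) ** hoursPassed + vectorRelevance;
}

const now = new Date("2024-01-03T12:00:00Z");
const decayRate = 0.01; // assumed value
const vectorRelevance = 0.8; // assumed to be identical for both documents

// Created two days ago, but read again one hour ago.
const recentlyRead = combinedScore(
  decayRate,
  hoursSince(new Date("2024-01-03T11:00:00Z"), now),
  vectorRelevance
);

// Created two days ago and never read since.
const neverReread = combinedScore(
  decayRate,
  hoursSince(new Date("2024-01-01T12:00:00Z"), now),
  vectorRelevance
);

console.log({ recentlyRead, neverReread });
```

Both documents are equally old and equally similar to the query, yet the one read an hour ago receives the higher score, because only the time since last access enters the formula.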
this.decayRate is a configurable decimal number between 0 and 1. A lower number means that documents are "remembered" for longer, while a higher number weights recently accessed documents more strongly.
Note that setting the decay rate to exactly 0 or 1 makes hoursPassed irrelevant and makes this retriever equivalent to a standard vector lookup.
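The effect of the two extreme decay rates can be checked by evaluating only the recency term of the formula. The sketch below is a small illustration; the function name recencyTerm and the sample hours and rates are assumptions chosen for the example.

```typescript
// Illustrative only: recencyTerm is a hypothetical helper, not a LangChain API.
function recencyTerm(decayRate: number, hoursPassed: number): number {
  return (1.0 - decayRate) ** hoursPassed;
}

const hours = [1, 24, 168]; // one hour, one day, one week
const rates = [0, 0.01, 0.5, 1];

// For each decay rate, compute the recency term at each elapsed time.
const table = rates.map((rate) => ({
  rate,
  terms: hours.map((h) => recencyTerm(rate, h)),
}));

console.log(JSON.stringify(table, null, 2));
```

With a rate of 0 the term is 1 for every elapsed time, and with a rate of 1 it is 0 for every positive elapsed time. In both cases every document gets the same recency contribution, so the ranking is decided by vectorRelevance alone. One edge case: with a rate of 1 and hoursPassed of exactly 0, JavaScript evaluates 0 ** 0 as 1.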
Usage
This example shows how to initialize a TimeWeightedVectorStoreRetriever with a vector store. Note that, because of the metadata the retriever requires, all documents must be added to the backing vector store using the addDocuments method on the retriever, not on the vector store itself.
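To illustrate why the retriever's own addDocuments matters, the following toy model stamps an access time on every document it adds. It is a sketch under stated assumptions, not the LangChain implementation: the class ToyTimeWeightedStore, the method addRaw, and the metadata field lastAccessedAt are all hypothetical names, and what the real library does when the metadata is missing is not shown here.

```typescript
// Toy model only: every name below is hypothetical and not part of LangChain.
interface ToyDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

class ToyTimeWeightedStore {
  private docs: ToyDocument[] = [];

  // Stamps each document with the access time that the recency term depends on.
  addDocuments(docs: ToyDocument[], now: Date = new Date()): void {
    for (const doc of docs) {
      this.docs.push({
        pageContent: doc.pageContent,
        metadata: { ...doc.metadata, lastAccessedAt: now.getTime() },
      });
    }
  }

  // Simulates writing to the backing store directly: no access time is recorded.
  addRaw(docs: ToyDocument[]): void {
    this.docs.push(...docs);
  }

  // Recency term per stored document; NaN when the access time is missing.
  recencyTerms(decayRate: number, now: Date): number[] {
    return this.docs.map((doc) => {
      const last = doc.metadata["lastAccessedAt"];
      if (typeof last !== "number") return Number.NaN;
      const hoursPassed = (now.getTime() - last) / 3600000;
      return (1.0 - decayRate) ** hoursPassed;
    });
  }
}

const store = new ToyTimeWeightedStore();
const addedAt = new Date("2024-01-01T00:00:00Z");
store.addDocuments([{ pageContent: "added via retriever", metadata: {} }], addedAt);
store.addRaw([{ pageContent: "added directly", metadata: {} }]);

const twoHoursLater = new Date("2024-01-01T02:00:00Z");
const terms = store.recencyTerms(0.5, twoHoursLater);
console.log(terms);
```

In this toy model, the document added through addDocuments has a well-defined recency term, while the one written directly has no access time and cannot be scored for recency.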
- npm: npm install @langchain/openai @langchain/core
- Yarn: yarn add @langchain/openai @langchain/core
- pnpm: pnpm add @langchain/openai @langchain/core
import { TimeWeightedVectorStoreRetriever } from "langchain/retrievers/time_weighted";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
const vectorStore = new MemoryVectorStore(new OpenAIEmbeddings());
const retriever = new TimeWeightedVectorStoreRetriever({
  vectorStore,
  memoryStream: [],
  searchKwargs: 2,
});
const documents = [
  "My name is John.",
  "My name is Bob.",
  "My favourite food is pizza.",
  "My favourite food is pasta.",
  "My favourite food is sushi.",
].map((pageContent) => ({ pageContent, metadata: {} }));
// All documents must be added using this method on the retriever (not the vector store!)
// so that the correct access history metadata is populated
await retriever.addDocuments(documents);
const results1 = await retriever.invoke("What is my favourite food?");
console.log(results1);
/*
[
Document { pageContent: 'My favourite food is pasta.', metadata: {} }
]
*/
const results2 = await retriever.invoke("What is my favourite food?");
console.log(results2);
/*
[
Document { pageContent: 'My favourite food is pasta.', metadata: {} }
]
*/
API Reference:
- TimeWeightedVectorStoreRetriever from langchain/retrievers/time_weighted
- MemoryVectorStore from langchain/vectorstores/memory
- OpenAIEmbeddings from @langchain/openai
Related
- Retriever conceptual guide
- Retriever how-to guides