Time-Weighted Retriever

A Time-Weighted Retriever is a retriever that takes recency into account in addition to similarity. The scoring algorithm is:

let score = (1.0 - this.decayRate) ** hoursPassed + vectorRelevance;

Notably, hoursPassed above refers to the time since the object in the retriever was last accessed, not the time since it was created. This means that frequently accessed objects remain "fresh" and score higher.
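
To make the formula concrete, here is a minimal standalone sketch of the scoring step. The helper names timeWeightedScore and hoursBetween, the timestamps, and the relevance value of 0.5 are assumptions made for illustration; they are not part of the library's API.

```typescript
// Hypothetical helper mirroring the scoring formula above; not the library's API.
function timeWeightedScore(
  decayRate: number,
  hoursPassed: number,
  vectorRelevance: number
): number {
  // The recency term shrinks exponentially with hours since the last access.
  const recency = (1.0 - decayRate) ** hoursPassed;
  return recency + vectorRelevance;
}

// Hours elapsed between two timestamps given in milliseconds since the epoch.
function hoursBetween(lastAccessedAtMs: number, nowMs: number): number {
  return (nowMs - lastAccessedAtMs) / (1000 * 60 * 60);
}

// Illustrative timestamps: one document accessed an hour ago, one a day ago.
const now = Date.UTC(2024, 0, 2, 0, 0, 0);
const accessedOneHourAgo = Date.UTC(2024, 0, 1, 23, 0, 0);
const accessedOneDayAgo = Date.UTC(2024, 0, 1, 0, 0, 0);

// Same similarity (0.5, an assumed value) for both documents.
const freshScore = timeWeightedScore(0.01, hoursBetween(accessedOneHourAgo, now), 0.5);
const staleScore = timeWeightedScore(0.01, hoursBetween(accessedOneDayAgo, now), 0.5);

console.log(freshScore > staleScore); // true
```

With equal similarity, the document accessed more recently scores higher, which is why each access effectively resets a document's "freshness".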

this.decayRate is a configurable decimal number between 0 and 1. A lower number means that documents are "remembered" for longer, while a higher number weights recently accessed documents more strongly.

Note that setting the decay rate to exactly 0 or 1 makes hoursPassed irrelevant and makes this retriever equivalent to a standard vector lookup.
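
The two boundary cases can be checked numerically. The sketch below evaluates only the recency term of the formula for a few positive values of hoursPassed; the helper name recency and the sample hour values are assumptions made for illustration.

```typescript
// Hypothetical helper: just the recency term of the scoring formula.
const recency = (decayRate: number, hoursPassed: number): number =>
  (1.0 - decayRate) ** hoursPassed;

// Sample elapsed times: one hour, one day, one week.
const hours = [1, 24, 168];

// decayRate = 0: the base is 1, so the term is always 1 regardless of elapsed time.
console.log(hours.map((h) => recency(0, h))); // [ 1, 1, 1 ]

// decayRate = 1: the base is 0, so the term is 0 for any positive elapsed time.
console.log(hours.map((h) => recency(1, h))); // [ 0, 0, 0 ]

// An intermediate rate: the term shrinks as more time passes.
console.log(hours.map((h) => recency(0.01, h)));
```

In both boundary cases the recency term adds the same constant to every document, so the ranking is decided by vectorRelevance alone.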

Usage

This example shows how to initialize a TimeWeightedVectorStoreRetriever with a vector store. Because the retriever relies on access-history metadata, all documents must be added to the backing vector store through the addDocuments method on the retriever, not through the vector store itself.

npm install langchain @langchain/openai @langchain/core

import { TimeWeightedVectorStoreRetriever } from "langchain/retrievers/time_weighted";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const vectorStore = new MemoryVectorStore(new OpenAIEmbeddings());

const retriever = new TimeWeightedVectorStoreRetriever({
  vectorStore,
  memoryStream: [],
  searchKwargs: 2,
});

const documents = [
  "My name is John.",
  "My name is Bob.",
  "My favourite food is pizza.",
  "My favourite food is pasta.",
  "My favourite food is sushi.",
].map((pageContent) => ({ pageContent, metadata: {} }));

// All documents must be added using this method on the retriever (not the vector store!)
// so that the correct access history metadata is populated
await retriever.addDocuments(documents);

const results1 = await retriever.invoke("What is my favourite food?");

console.log(results1);

/*
  [
    Document { pageContent: 'My favourite food is pasta.', metadata: {} }
  ]
*/

const results2 = await retriever.invoke("What is my favourite food?");

console.log(results2);

/*
  [
    Document { pageContent: 'My favourite food is pasta.', metadata: {} }
  ]
*/
