Typesense

Vector store that utilizes the Typesense search engine.

Basic Usage

tip

See this section for general instructions on installing integration packages.

npm install @langchain/openai @langchain/community @langchain/core

yarn add @langchain/openai @langchain/community @langchain/core

pnpm add @langchain/openai @langchain/community @langchain/core

import {
  Typesense,
  TypesenseConfig,
} from "@langchain/community/vectorstores/typesense";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Client } from "typesense";
import { Document } from "@langchain/core/documents";

const vectorTypesenseClient = new Client({
  nodes: [
    {
      // Ideally should come from your .env file
      host: "...",
      port: 123,
      protocol: "https",
    },
  ],
  // Ideally should come from your .env file
  apiKey: "...",
  numRetries: 3,
  connectionTimeoutSeconds: 60,
});

const typesenseVectorStoreConfig = {
  // Typesense client
  typesenseClient: vectorTypesenseClient,
  // Name of the collection to store the vectors in
  schemaName: "your_schema_name",
  // Optional column names to be used in Typesense
  columnNames: {
    // "vec" is the default name for the vector column in Typesense but you can change it to whatever you want
    vector: "vec",
    // "text" is the default name for the text column in Typesense but you can change it to whatever you want
    pageContent: "text",
    // Names of the columns that you will save in your typesense schema and need to be retrieved as metadata when searching
    metadataColumnNames: ["foo", "bar", "baz"],
  },
  // Optional search parameters to be passed to Typesense when searching
  searchParams: {
    q: "*",
    filter_by: "foo:[fooo]",
    query_by: "",
  },
  // You can override the default Typesense import function if you want to do something more complex
  // Default import function:
  // async importToTypesense<
  //   T extends Record<string, unknown> = Record<string, unknown>
  // >(data: T[], collectionName: string) {
  //   const chunkSize = 2000;
  //   for (let i = 0; i < data.length; i += chunkSize) {
  //     const chunk = data.slice(i, i + chunkSize);

  //     await this.caller.call(async () => {
  //       await this.client
  //         .collections<T>(collectionName)
  //         .documents()
  //         .import(chunk, { action: "emplace", dirty_values: "drop" });
  //     });
  //   }
  // }
  import: async (data, collectionName) => {
    await vectorTypesenseClient
      .collections(collectionName)
      .documents()
      .import(data, { action: "emplace", dirty_values: "drop" });
  },
} satisfies TypesenseConfig;

/**
 * Creates a Typesense vector store from a list of documents.
 * Will update documents if there is a document with the same id, at least with the default import function.
 * @param documents list of documents to create the vector store from
 * @returns Typesense vector store
 */
const createVectorStoreWithTypesense = async (documents: Document[] = []) =>
  Typesense.fromDocuments(
    documents,
    new OpenAIEmbeddings(),
    typesenseVectorStoreConfig
  );

/**
 * Returns a Typesense vector store from an existing index.
 * @returns Typesense vector store
 */
const getVectorStoreWithTypesense = async () =>
  new Typesense(new OpenAIEmbeddings(), typesenseVectorStoreConfig);

// Do a similarity search
const vectorStore = await getVectorStoreWithTypesense();
const documents = await vectorStore.similaritySearch("hello world");

// Add metadata filters using Typesense search parameters.
// This excludes documents with author "JK Rowling": if both Joe Rowling and
// JK Rowling exist, only Joe Rowling is returned.
await vectorStore.similaritySearch("Rowling", undefined, {
  filter_by: "author:!=JK Rowling",
});

// Delete documents by id
await vectorStore.deleteDocuments(["document_id_1", "document_id_2"]);
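
The `filter_by` string passed to `similaritySearch` above is plain Typesense filter syntax, so it can be assembled from structured clauses. The following is a minimal sketch, not part of the LangChain or Typesense API: `FilterClause` and `buildFilterBy` are hypothetical names, the `year` field is invented for illustration, and values containing special characters are not escaped.

```typescript
// Hypothetical helper types; not provided by LangChain or Typesense.
interface FilterClause {
  field: string;
  // ":=" exact match, ":!=" not equal, ":" plain match
  op: ":" | ":=" | ":!=";
  value: string | number | boolean;
}

// Joins clauses with "&&" so that every clause must hold.
// Assumption: values need no escaping.
function buildFilterBy(clauses: FilterClause[]): string {
  return clauses
    .map(({ field, op, value }) => `${field}${op}${value}`)
    .join(" && ");
}

const filterBy = buildFilterBy([
  { field: "author", op: ":!=", value: "JK Rowling" },
  { field: "year", op: ":=", value: 1997 }, // "year" is an invented metadata field
]);
// filterBy === "author:!=JK Rowling && year:=1997"
```

The resulting string could then be passed as `{ filter_by: filterBy }` in the third argument of `similaritySearch`.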

Constructor

Before starting, create a schema in Typesense with an id, a field for the vector, and a field for the text. Add as many other fields as needed for the metadata.
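
As a rough sketch of what such a schema could look like, the helper below builds a collection definition using the default column names `vec` and `text`. `buildVectorSchema` and the interfaces are hypothetical names introduced here. The dimension 1536 is an assumption and must match the output size of your embedding model. The `id` field is not declared because Typesense treats `id` as a built-in string field on every document.

```typescript
// Field definition in the general shape Typesense accepts when creating a collection.
interface SchemaField {
  name: string;
  type: "string" | "string[]" | "int32" | "float" | "bool" | "float[]";
  num_dim?: number; // only meaningful for the vector field
  optional?: boolean;
}

interface CollectionSchema {
  name: string;
  fields: SchemaField[];
}

// Hypothetical helper: vector field, text field, then any metadata fields.
function buildVectorSchema(
  schemaName: string,
  numDim: number,
  metadataFields: SchemaField[] = [],
  vectorColumn = "vec",
  textColumn = "text"
): CollectionSchema {
  return {
    name: schemaName,
    fields: [
      { name: vectorColumn, type: "float[]", num_dim: numDim },
      { name: textColumn, type: "string" },
      ...metadataFields,
    ],
  };
}

// 1536 is an assumed embedding size; adjust it to your model.
const schema = buildVectorSchema("your_schema_name", 1536, [
  { name: "foo", type: "string", optional: true },
]);
```

With a Typesense client such as `vectorTypesenseClient` from Basic Usage, the object could be passed to `vectorTypesenseClient.collections().create(schema)`.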

constructor(embeddings: Embeddings, config: TypesenseConfig): Constructs a new instance of the Typesense class.
- embeddings: An instance of the Embeddings class used for embedding documents.
- config: Configuration object for the Typesense vector store.
  - typesenseClient: Typesense client instance.
  - schemaName: Name of the Typesense schema in which documents will be stored and searched.
  - searchParams (optional): Typesense search parameters. Default is { q: '*', per_page: 5, query_by: '' }.
  - columnNames (optional): Column names configuration.
    - vector (optional): Vector column name. Default is 'vec'.
    - pageContent (optional): Page content column name. Default is 'text'.
    - metadataColumnNames (optional): Metadata column names. Default is an empty array [].
  - import (optional): Replaces the default function for importing data to Typesense. Overriding it can change how existing documents are updated.
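
The `import` option can be illustrated with a chunked importer modeled on the default function quoted in the Basic Usage comments. This is a sketch under assumptions: `toChunks`, `importInChunks`, and `ChunkImporter` are hypothetical names, and the chunk size of 2000 and the `emplace`/`drop` options simply mirror that quoted default.

```typescript
// Options mirrored from the default import function shown in Basic Usage.
interface ImportOptions {
  action: "emplace";
  dirty_values: "drop";
}

// Hypothetical callback that sends one chunk to Typesense.
type ChunkImporter<T> = (chunk: T[], options: ImportOptions) => Promise<unknown>;

// Splits data into consecutive slices of at most chunkSize items.
function toChunks<T>(data: T[], chunkSize = 2000): T[][] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  const chunks: T[][] = [];
  for (let i = 0; i < data.length; i += chunkSize) {
    chunks.push(data.slice(i, i + chunkSize));
  }
  return chunks;
}

// Imports the chunks one after another; "emplace" creates or updates by id.
async function importInChunks<T>(
  data: T[],
  importChunk: ChunkImporter<T>,
  chunkSize = 2000
): Promise<void> {
  for (const chunk of toChunks(data, chunkSize)) {
    await importChunk(chunk, { action: "emplace", dirty_values: "drop" });
  }
}
```

In a config, this could be wired up as `import: (data, collectionName) => importInChunks(data, (chunk, options) => vectorTypesenseClient.collections(collectionName).documents().import(chunk, options))`. Switching `action` away from `emplace` would change whether documents with an existing id are updated.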

Methods

async addDocuments(documents: Document[]): Promise<void>: Adds documents to the vector store. A document is updated if one with the same ID already exists.
static async fromDocuments(docs: Document[], embeddings: Embeddings, config: TypesenseConfig): Promise<Typesense>: Creates a Typesense vector store from a list of documents. Documents are added to the vector store during construction.
static async fromTexts(texts: string[], metadatas: object[], embeddings: Embeddings, config: TypesenseConfig): Promise<Typesense>: Creates a Typesense vector store from a list of texts and associated metadata. Texts are converted to documents and added to the vector store during construction.
async similaritySearch(query: string, k?: number, filter?: Record<string, unknown>): Promise<Document[]>: Searches for documents similar to a query. Returns an array of similar documents.
async deleteDocuments(documentIds: string[]): Promise<void>: Deletes documents from the vector store based on their IDs.

Related

Vector store conceptual guide
Vector store how-to guides
