Xata

Xata is a serverless data platform based on PostgreSQL. It provides a type-safe TypeScript/JavaScript SDK for interacting with your database and a UI for managing your data.

Xata has a native vector type that can be added to any table and supports similarity search. LangChain inserts vectors directly into Xata and queries it for the nearest neighbors of a given vector, so you can use any of the LangChain Embeddings integrations with Xata.
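To make "nearest neighbors of a given vector" concrete, here is a minimal in-memory sketch. It is not how Xata implements similarity search; it only illustrates the idea, and it assumes cosine similarity as the metric, which is one common choice. The names StoredVector, cosineSimilarity, and nearestNeighbors are hypothetical.

```typescript
// Illustrative only: brute-force nearest-neighbor search over in-memory vectors.
// Xata performs this server-side; the metric here (cosine) is an assumption.
type StoredVector = { id: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function nearestNeighbors(
  query: number[],
  rows: StoredVector[],
  k: number
): Array<{ id: string; score: number }> {
  return rows
    .map((row) => ({ id: row.id, score: cosineSimilarity(query, row.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}
```

In practice the query vector comes from the same embedding model that produced the stored vectors, so the dimensions always match.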

Setup

Install the Xata CLI

npm install @xata.io/cli -g

Create a database to be used as a vector store

In the Xata UI, create a new database. You can name it whatever you want, but this example uses langchain. Create a table, which you can also name anything; this example uses vectors. Add the following columns via the UI:

  • content of type "Text". This stores the Document.pageContent values.

  • embedding of type "Vector". Set the dimension to match the embedding model you plan to use (1536 for OpenAI).

  • Any other columns you want to use as metadata. They are populated from the Document.metadata object. For example, if the Document.metadata object has a title property, you can create a title column in the table and it will be populated.
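The column layout above can be pictured as a mapping from a document plus its embedding to one table row. The sketch below is an illustration of that mapping under the description given here, not the integration's actual code; DocumentLike, VectorRow, and toRow are hypothetical names, and the dimension check is an added assumption to show why the column dimension must match the model.

```typescript
// Illustrative only: how one document could map onto the columns described above.
type DocumentLike = { pageContent: string; metadata: Record<string, unknown> };

interface VectorRow {
  content: string; // the "Text" column
  embedding: number[]; // the "Vector" column
  [metadataColumn: string]: unknown; // e.g. a "title" column
}

function toRow(
  doc: DocumentLike,
  embedding: number[],
  expectedDimension: number
): VectorRow {
  // The vector column has a fixed dimension, so a mismatch is an error.
  if (embedding.length !== expectedDimension) {
    throw new Error(
      `Expected ${expectedDimension} dimensions, got ${embedding.length}`
    );
  }
  // Metadata properties become columns; content and embedding are set last
  // so that metadata cannot overwrite them.
  return { ...doc.metadata, content: doc.pageContent, embedding };
}
```

For example, a document with metadata { title: "Intro" } yields a row with content, embedding, and title fields.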

Initialize the project

In your project, run:

xata init

Then choose the database you created above. This also generates a xata.ts or xata.js file that defines the client you can use to interact with the database. See the Xata getting started docs for more details on using the Xata JavaScript/TypeScript SDK.
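If you do not use the generated client, the examples in the Usage section build one by hand from the XATA_API_KEY, XATA_DB_URL, and XATA_BRANCH environment variables. The following is a small sketch of validating those settings up front and reporting every missing variable at once. The readXataConfig helper and XataConfig type are hypothetical and not part of the Xata SDK.

```typescript
// Illustrative only: collect connection settings from an environment-like object.
// readXataConfig and XataConfig are hypothetical names, not Xata SDK exports.
type XataConfig = { apiKey: string; databaseURL: string; branch: string };

function readXataConfig(
  env: Record<string, string | undefined>
): XataConfig {
  const required = ["XATA_API_KEY", "XATA_DB_URL"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report every missing variable in one error message.
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    apiKey: env.XATA_API_KEY as string,
    databaseURL: env.XATA_DB_URL as string,
    // Same fallback as the examples in the Usage section.
    branch: env.XATA_BRANCH || "main",
  };
}
```

In a Node.js project you would call readXataConfig(process.env) and pass the result to the client constructor.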

Usage

npm install @langchain/openai @langchain/community @langchain/core langchain @xata.io/client

Example: Q&A chatbot using OpenAI and Xata as vector store

This example uses the VectorDBQAChain to search the documents stored in Xata and then passes them as context to the OpenAI model to answer the user's question.

import { XataVectorSearch } from "@langchain/community/vectorstores/xata";
import { OpenAIEmbeddings, OpenAI } from "@langchain/openai";
import { BaseClient } from "@xata.io/client";
import { VectorDBQAChain } from "langchain/chains";
import { Document } from "@langchain/core/documents";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/xata

// If you use the generated client, you don't need this function.
// Just import getXataClient from the generated xata.ts instead.
const getXataClient = () => {
  if (!process.env.XATA_API_KEY) {
    throw new Error("XATA_API_KEY not set");
  }

  if (!process.env.XATA_DB_URL) {
    throw new Error("XATA_DB_URL not set");
  }
  const xata = new BaseClient({
    databaseURL: process.env.XATA_DB_URL,
    apiKey: process.env.XATA_API_KEY,
    branch: process.env.XATA_BRANCH || "main",
  });
  return xata;
};

export async function run() {
  const client = getXataClient();

  const table = "vectors";
  const embeddings = new OpenAIEmbeddings();
  const store = new XataVectorSearch(embeddings, { client, table });

  // Add documents
  const docs = [
    new Document({
      pageContent: "Xata is a Serverless Data platform based on PostgreSQL",
    }),
    new Document({
      pageContent:
        "Xata offers a built-in vector type that can be used to store and query vectors",
    }),
    new Document({
      pageContent: "Xata includes similarity search",
    }),
  ];

  const ids = await store.addDocuments(docs);

  // Wait briefly before querying, to give the new records time to become searchable.
  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 2000));

  // Retrieve the single closest document (k: 1) and pass it to the model as context.
  const model = new OpenAI();
  const chain = VectorDBQAChain.fromLLM(model, store, {
    k: 1,
    returnSourceDocuments: true,
  });
  const response = await chain.invoke({ query: "What is Xata?" });

  console.log(JSON.stringify(response, null, 2));

  // Clean up the documents added by this example.
  await store.delete({ ids });
}
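The "pass them as context" step can be pictured as placing the retrieved documents into the prompt ahead of the question. The sketch below shows that idea with a made-up template. It is not the prompt VectorDBQAChain actually uses, and buildQaPrompt and RetrievedDoc are hypothetical names.

```typescript
// Illustrative only: combine retrieved documents and a question into one prompt.
// The template text is an assumption, not the chain's real prompt.
type RetrievedDoc = { pageContent: string };

function buildQaPrompt(question: string, docs: RetrievedDoc[]): string {
  // Number each retrieved passage so the context stays readable.
  const context = docs
    .map((doc, i) => `[${i + 1}] ${doc.pageContent}`)
    .join("\n");
  return [
    "Use the following context to answer the question.",
    "",
    context,
    "",
    `Question: ${question}`,
    "Answer:",
  ].join("\n");
}
```

With k: 1, as in the example above, only the single closest document would appear in the context section.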

Example: Similarity search with a metadata filter

This example shows how to implement semantic search using LangChain.js and Xata. Before running it, make sure to add an author column of type String to the vectors table in Xata.

import { XataVectorSearch } from "@langchain/community/vectorstores/xata";
import { OpenAIEmbeddings } from "@langchain/openai";
import { BaseClient } from "@xata.io/client";
import { Document } from "@langchain/core/documents";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/xata
// Also, add a column named "author" to the "vectors" table.

// If you use the generated client, you don't need this function.
// Just import getXataClient from the generated xata.ts instead.
const getXataClient = () => {
  if (!process.env.XATA_API_KEY) {
    throw new Error("XATA_API_KEY not set");
  }

  if (!process.env.XATA_DB_URL) {
    throw new Error("XATA_DB_URL not set");
  }
  const xata = new BaseClient({
    databaseURL: process.env.XATA_DB_URL,
    apiKey: process.env.XATA_API_KEY,
    branch: process.env.XATA_BRANCH || "main",
  });
  return xata;
};

export async function run() {
  const client = getXataClient();
  const table = "vectors";
  const embeddings = new OpenAIEmbeddings();
  const store = new XataVectorSearch(embeddings, { client, table });

  // Add documents; the author metadata is stored in the "author" column.
  const docs = [
    new Document({
      pageContent: "Xata works great with Langchain.js",
      metadata: { author: "Xata" },
    }),
    new Document({
      pageContent: "Xata works great with Langchain",
      metadata: { author: "Langchain" },
    }),
    new Document({
      pageContent: "Xata includes similarity search",
      metadata: { author: "Xata" },
    }),
  ];
  const ids = await store.addDocuments(docs);

  // Wait briefly before querying, to give the new records time to become searchable.
  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 2000));

  // author is applied as a pre-filter to the similarity search
  const results = await store.similaritySearchWithScore("xata works great", 6, {
    author: "Langchain",
  });

  console.log(JSON.stringify(results, null, 2));

  // Clean up the documents added by this example.
  await store.delete({ ids });
}
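The comment in the example says the author filter is applied as a pre-filter. The in-memory sketch below illustrates what that ordering means: rows are narrowed by metadata first, and only the survivors are ranked by similarity, so fewer than k results can come back. It is not Xata's implementation; Row, cosine, and filteredSearch are hypothetical names, and cosine similarity is an assumed metric.

```typescript
// Illustrative only: metadata pre-filter followed by similarity ranking.
type Row = {
  content: string;
  embedding: number[];
  metadata: Record<string, string>;
};

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0
    ? 0
    : dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function filteredSearch(
  query: number[],
  rows: Row[],
  k: number,
  filter: Record<string, string>
): Array<[Row, number]> {
  return rows
    // Step 1: keep only rows whose metadata matches every filter entry.
    .filter((row) =>
      Object.entries(filter).every(([key, value]) => row.metadata[key] === value)
    )
    // Step 2: score the remaining rows against the query vector.
    .map((row): [Row, number] => [row, cosine(query, row.embedding)])
    // Step 3: return at most k rows, best match first.
    .sort((x, y) => y[1] - x[1])
    .slice(0, k);
}
```

With the three documents from the example and a filter of { author: "Langchain" }, a search with k set to 6 in this sketch returns a single result, because only one row passes the filter.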
