Google Cloud SQL for PostgreSQL

Cloud SQL is a fully managed relational database service that offers high performance, seamless integration, and impressive scalability, with database engines such as PostgreSQL.

This guide provides a quick overview of how to use Cloud SQL for PostgreSQL to store vector embeddings with the PostgresVectorStore class.

Overview

Integration details

Class Package PY support Package latest
PostgresVectorStore @langchain/google-cloud-sql-pg 0.0.1

Before you begin

To use this package, first complete the following steps:

  1. Select or create a Cloud Platform project.
  2. Enable billing for your project.
  3. Enable the Cloud SQL Admin API.
  4. Set up authentication.
  5. Create a Cloud SQL instance.
  6. Create a Cloud SQL database.
  7. Add a user to the database.

Authentication

Authenticate locally to your Google Cloud account using the gcloud auth login command.

Set your Google Cloud project

Set your Google Cloud project ID to use Google Cloud resources locally:

gcloud config set project YOUR-PROJECT-ID

If you don't know your project ID, try the following:

  * Run gcloud config list.
  * Run gcloud projects list.
  * See the support page: Locate the project ID.

Setting up a PostgresVectorStore instance

To use the PostgresVectorStore library, install the @langchain/google-cloud-sql-pg package and then follow the steps below.

First, log in to your Google Cloud account and set the following environment variables for your Google Cloud project. Which ones you need depends on how you configure your PostgresEngine instance (fromInstance, fromEngine, or fromEngineArgs):

PROJECT_ID="your-project-id"
REGION="your-project-region" # example: "us-central1"
INSTANCE_NAME="your-instance"
DB_NAME="your-database-name"
DB_USER="your-database-user"
PASSWORD="your-database-password"
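
The setup code in this guide reads these variables with an empty-string fallback, so a missing value may only surface as a connection error. As a minimal sketch, a check like the one below can fail earlier and name the missing variables. The helper readCloudSqlConfig and its types are hypothetical and not part of @langchain/google-cloud-sql-pg.

```typescript
// Hypothetical helper, not part of @langchain/google-cloud-sql-pg.
const REQUIRED_KEYS = [
  "PROJECT_ID",
  "REGION",
  "INSTANCE_NAME",
  "DB_NAME",
  "DB_USER",
  "PASSWORD",
] as const;

type ConfigKey = (typeof REQUIRED_KEYS)[number];
type CloudSqlConfig = Record<ConfigKey, string>;

// Reads the six variables from an env-like object and throws if any is
// missing or empty, listing all of them in a single error message.
function readCloudSqlConfig(
  env: Record<string, string | undefined>
): CloudSqlConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as CloudSqlConfig;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}
```

In a Node.js script this would typically be called as readCloudSqlConfig(process.env) after dotenv.config() has run.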

Setting up an instance

To instantiate a PostgresVectorStore, first create a database connection through the PostgresEngine, then initialize the vector store table, and finally call the .initialize() method to instantiate the vector store.

import {
  Column,
  PostgresEngine,
  PostgresEngineArgs,
  PostgresVectorStore,
  PostgresVectorStoreArgs,
  VectorStoreTableArgs,
} from "@langchain/google-cloud-sql-pg";
import { SyntheticEmbeddings } from "@langchain/core/utils/testing"; // Used here as a stand-in embedding service
import * as dotenv from "dotenv";

// Load the environment variables defined above
dotenv.config();

const peArgs: PostgresEngineArgs = {
  user: process.env.DB_USER ?? "",
  password: process.env.PASSWORD ?? "",
};

// PostgresEngine instantiation
const engine: PostgresEngine = await PostgresEngine.fromInstance(
  process.env.PROJECT_ID ?? "",
  process.env.REGION ?? "",
  process.env.INSTANCE_NAME ?? "",
  process.env.DB_NAME ?? "",
  peArgs
);

const vectorStoreArgs: VectorStoreTableArgs = {
  metadataColumns: [new Column("page", "TEXT"), new Column("source", "TEXT")],
};

// Vector store table initialization (768 is the vector size)
await engine.initVectorstoreTable("my_vector_store_table", 768, vectorStoreArgs);
const embeddingService = new SyntheticEmbeddings({ vectorSize: 768 });

const pvectorArgs: PostgresVectorStoreArgs = {
  metadataColumns: ["page", "source"],
};

// PostgresVectorStore instantiation
const vectorStore = await PostgresVectorStore.initialize(
  engine,
  embeddingService,
  "my_vector_store_table",
  pvectorArgs
);
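
The same vector size (768) appears both when the table is initialized and when the embedding service is created, and the two presumably need to agree. The following sketch shows one way to guard against a mismatch. The function assertVectorSize is a hypothetical helper, not a library API, and the zero-filled vector only stands in for a real embedding.

```typescript
// Hypothetical guard, not part of @langchain/google-cloud-sql-pg.
// Throws when a vector's length differs from the size the table was created with.
function assertVectorSize(vector: number[], expectedSize: number): void {
  if (vector.length !== expectedSize) {
    throw new Error(
      `Embedding has ${vector.length} dimensions but the table expects ${expectedSize}`
    );
  }
}

// Stand-in for one embedding; in practice this would come from the embedding service.
const sampleEmbedding: number[] = [];
for (let i = 0; i < 768; i++) {
  sampleEmbedding.push(0);
}

assertVectorSize(sampleEmbedding, 768); // passes silently
```

With the setup above, the check could be run once before initializing the store, for example on the result of await embeddingService.embedQuery("probe").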

Manage Vector Store

Add Documents to vector store

You can add Documents to the vector store with or without passing ids.

import { v4 as uuidv4 } from "uuid";
import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { page: 0, source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { page: 1, source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { page: 2, source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { page: 3, source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

// One id per document; the ids option can also be omitted
const ids = [uuidv4(), uuidv4(), uuidv4(), uuidv4()];

await vectorStore.addDocuments(documents, { ids: ids });
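
Because ids are optional, calling code sometimes has them and sometimes does not. A minimal sketch of normalizing both cases into one id per document is shown below. The helper resolveIds is hypothetical, and it assumes the ids array is meant to line up one-to-one with the documents. The id generator is injected so that uuidv4 from the snippet above (or any other generator) can be used.

```typescript
// Hypothetical helper, not part of the library: returns one id per document.
// Supplied ids must match the document count; otherwise ids are produced
// by the injected generator (for example uuidv4).
function resolveIds(
  documentCount: number,
  makeId: () => string,
  ids?: string[]
): string[] {
  if (ids !== undefined) {
    if (ids.length !== documentCount) {
      throw new Error(
        `Expected ${documentCount} ids but received ${ids.length}`
      );
    }
    return ids;
  }
  const generated: string[] = [];
  for (let i = 0; i < documentCount; i++) {
    generated.push(makeId());
  }
  return generated;
}
```

For example, resolveIds(documents.length, uuidv4) would produce four fresh ids for the documents above, and the result could then be passed as the ids option.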

Delete Documents from vector store

You can delete one or more Documents from the vector store by passing an array of the ids to delete:

// deleting a document
const id1 = ids[0];
await vectorStore.delete({ ids: [id1] });

// deleting more than one document
await vectorStore.delete({ ids: ids });

Search for documents

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while your chain or agent is running.

Query directly

A simple similarity search can be performed as follows:

// Double quotes mark the column name; single quotes mark the string value
const filter = `"source" = 'https://example.com'`;

const results = await vectorStore.similaritySearch("biology", 2, filter);

for (const doc of results) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
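
The filter appears to be a SQL-style condition string, and in PostgreSQL double quotes delimit identifiers while single quotes delimit string literals. As a sketch under that assumption, the hypothetical helper below builds a simple equality filter and escapes embedded quote characters by doubling them. It is not part of the library.

```typescript
// Hypothetical helper, not part of the library.
// Double quotes delimit the column identifier; single quotes delimit the
// string literal. Embedded quote characters are escaped by doubling them.
function equalsFilter(column: string, value: string): string {
  const identifier = `"${column.replace(/"/g, '""')}"`;
  const literal = `'${value.replace(/'/g, "''")}'`;
  return `${identifier} = ${literal}`;
}

console.log(equalsFilter("source", "https://example.com"));
// prints: "source" = 'https://example.com'
```

The returned string can be passed wherever the examples in this guide pass a filter.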

If you want to run a similarity search and receive the corresponding scores, you can run:

// Double quotes mark the column name; single quotes mark the string value
const filter = `"source" = 'https://example.com'`;
const resultsWithScores = await vectorStore.similaritySearchWithScore(
  "biology",
  2,
  filter
);

for (const [doc, score] of resultsWithScores) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`
  );
}

Query using the maximal marginal relevance search

Maximal marginal relevance optimizes for similarity to the query and for diversity among the selected documents.

const options = {
  k: 4,
  filter: `"source" = 'https://example.com'`,
};

// Uses the vectorStore instance created earlier in this guide
const results = await vectorStore.maxMarginalRelevanceSearch("biology", options);

for (const doc of results) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
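
To make the trade-off between similarity and diversity concrete, here is a minimal, self-contained sketch of the textbook maximal marginal relevance selection over plain vectors. It is not the library's implementation. The names cosineSimilarity and mmrSelect, the lambda weighting parameter, and the sample vectors are all assumptions made for illustration.

```typescript
// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy MMR: repeatedly pick the candidate that maximizes
//   lambda * sim(query, candidate) - (1 - lambda) * max sim(candidate, selected)
// Returns the indices of the chosen candidates in selection order.
function mmrSelect(
  query: number[],
  candidates: number[][],
  k: number,
  lambda = 0.5
): number[] {
  const selected: number[] = [];
  const used: boolean[] = candidates.map(() => false);
  while (selected.length < k && selected.length < candidates.length) {
    let bestIndex = -1;
    let bestScore = -Infinity;
    for (let i = 0; i < candidates.length; i++) {
      if (used[i]) continue;
      const relevance = cosineSimilarity(query, candidates[i]);
      // Redundancy is the highest similarity to anything already selected.
      let redundancy = selected.length === 0 ? 0 : -Infinity;
      for (const j of selected) {
        redundancy = Math.max(
          redundancy,
          cosineSimilarity(candidates[i], candidates[j])
        );
      }
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = i;
      }
    }
    used[bestIndex] = true;
    selected.push(bestIndex);
  }
  return selected;
}

// Candidates 0 and 1 are near-duplicates; candidate 2 is less relevant but different.
const sampleQuery = [1, 0];
const sampleCandidates = [
  [1, 0.1],
  [1, 0.12],
  [1, -1],
];
console.log(mmrSelect(sampleQuery, sampleCandidates, 2, 0.5)); // prints [ 0, 2 ]
console.log(mmrSelect(sampleQuery, sampleCandidates, 2, 1)); // prints [ 0, 1 ]
```

With lambda at 1 the selection reduces to plain similarity ranking, while lower values penalize candidates that resemble documents already chosen, which is why the near-duplicate is skipped in the first call.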