
SingleStore

SingleStoreDB is a high-performance distributed SQL database designed to run well in both cloud and on-premises environments. It offers a broad feature set and flexible deployment options.

A standout feature of SingleStoreDB is its support for vector storage and operations, which makes it well suited to applications that need AI capabilities such as text similarity matching. Built-in vector functions such as dot_product and euclidean_distance let developers implement similarity-based algorithms efficiently.
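To make the two measures concrete, here is a minimal TypeScript sketch of what a dot product and a Euclidean distance compute. The function names `dotProduct` and `euclideanDistance` are illustrative only; SingleStoreDB evaluates its own built-in functions inside the database, and this is not that implementation.

```typescript
// Illustrative plain-TypeScript versions of the two measures.
// These are NOT the SingleStoreDB built-ins; they only show the arithmetic.

function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// Example vectors (made-up values).
const query = [1, 0];
const candidate = [0.6, 0.8];
console.log(dotProduct(query, candidate));
console.log(euclideanDistance(query, candidate));
```

Note the opposite orientation of the two measures: a larger dot product means the vectors are more similar, while a smaller Euclidean distance means they are closer.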

For developers who want to use vector data in SingleStoreDB, a tutorial on working with vector data is available. It covers the Vector Store within SingleStoreDB and its support for searches based on vector similarity. With vector indexes, these queries run quickly, so relevant data can be retrieved fast.

SingleStoreDB's Vector Store also integrates with full-text indexing based on Lucene, enabling text similarity searches. Users can filter search results on selected fields of the document metadata objects to make queries more precise.

SingleStoreDB can also combine vector and full-text searches in several ways. Developers can prefilter by text or vector similarity and then select the most relevant data, or use a weighted sum to compute a final similarity score.

In short, SingleStoreDB provides a comprehensive solution for managing and querying vector data in AI-driven applications.

Compatibility

Only available on Node.js.

LangChain.js requires the mysql2 library to create a connection to a SingleStoreDB instance.

Setup

  1. Set up a SingleStoreDB environment. You can choose between the Cloud-based and On-Premise editions.

  2. Install the mysql2 JS client:

npm install -S mysql2

Usage

SingleStoreVectorStore manages a connection pool. It is recommended to call await store.end(); before terminating your application to ensure that all connections are closed properly and to prevent resource leaks.
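One way to make sure end() is always called, even when a query throws, is a try/finally wrapper. The sketch below is an assumption-laden illustration: the `Closable` interface and the `withStore` helper are hypothetical names introduced here, not part of LangChain.js. The only thing taken from the text above is that the store exposes an async end() method.

```typescript
// Hypothetical minimal shape: anything with an async end() method,
// which is the method recommended above for closing the connection pool.
interface Closable {
  end(): Promise<void>;
}

// Hypothetical helper: runs `work` and always closes the store afterwards,
// whether `work` resolves or throws.
async function withStore<S extends Closable, T>(
  store: S,
  work: (store: S) => Promise<T>
): Promise<T> {
  try {
    return await work(store);
  } finally {
    await store.end();
  }
}
```

With a real store this could be used as await withStore(vectorStore, (s) => s.similaritySearch("hello world", 1)); so the pool is released on both the success path and the error path.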

Standard usage

npm install @langchain/openai @langchain/community @langchain/core

Below is a simple example showing how to import the relevant modules and perform a basic similarity search using the SingleStoreVectorStore:

import { SingleStoreVectorStore } from "@langchain/community/vectorstores/singlestore";
import { OpenAIEmbeddings } from "@langchain/openai";

export const run = async () => {
  // Embed the texts and store them, together with their metadata, in SingleStoreDB.
  const vectorStore = await SingleStoreVectorStore.fromTexts(
    ["Hello world", "Bye bye", "hello nice world"],
    [{ id: 2 }, { id: 1 }, { id: 3 }],
    new OpenAIEmbeddings(),
    {
      connectionOptions: {
        host: process.env.SINGLESTORE_HOST,
        port: Number(process.env.SINGLESTORE_PORT),
        user: process.env.SINGLESTORE_USERNAME,
        password: process.env.SINGLESTORE_PASSWORD,
        database: process.env.SINGLESTORE_DATABASE,
      },
    }
  );

  // Retrieve the single most similar document.
  const resultOne = await vectorStore.similaritySearch("hello world", 1);
  console.log(resultOne);

  // Close the connection pool.
  await vectorStore.end();
};


Metadata Filtering

To filter results on specific metadata fields, pass a filter parameter. The search is then narrowed to the documents that match all fields specified in the filter object:

import { SingleStoreVectorStore } from "@langchain/community/vectorstores/singlestore";
import { OpenAIEmbeddings } from "@langchain/openai";

export const run = async () => {
  const vectorStore = await SingleStoreVectorStore.fromTexts(
    ["Good afternoon", "Bye bye", "Boa tarde!", "Até logo!"],
    [
      { id: 1, language: "English" },
      { id: 2, language: "English" },
      { id: 3, language: "Portuguese" },
      { id: 4, language: "Portuguese" },
    ],
    new OpenAIEmbeddings(),
    {
      connectionOptions: {
        host: process.env.SINGLESTORE_HOST,
        port: Number(process.env.SINGLESTORE_PORT),
        user: process.env.SINGLESTORE_USERNAME,
        password: process.env.SINGLESTORE_PASSWORD,
        database: process.env.SINGLESTORE_DATABASE,
      },
      distanceMetric: "EUCLIDEAN_DISTANCE",
    }
  );

  // Only documents whose metadata matches every field of the filter are considered.
  const resultOne = await vectorStore.similaritySearch("greetings", 1, {
    language: "Portuguese",
  });
  console.log(resultOne);

  // Close the connection pool.
  await vectorStore.end();
};
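The "match all specified fields" rule can be pictured with a small in-memory sketch. The `matchesFilter` helper, the `Metadata` type, and the sample data are all hypothetical; how the store evaluates the filter inside the database is not shown in this document, and this sketch only compares primitive values with strict equality.

```typescript
type Metadata = Record<string, unknown>;

// Hypothetical helper: true when every field in `filter`
// has a strictly equal value in `metadata`.
function matchesFilter(metadata: Metadata, filter: Metadata): boolean {
  return Object.entries(filter).every(
    ([key, value]) => metadata[key] === value
  );
}

// Made-up sample documents for illustration.
const docs = [
  { pageContent: "Good afternoon", metadata: { id: 1, language: "English" } },
  { pageContent: "Bye bye", metadata: { id: 2, language: "English" } },
  { pageContent: "Buenas tardes", metadata: { id: 3, language: "Spanish" } },
];

// A filter with two fields keeps only documents matching both.
const matches = docs.filter((doc) =>
  matchesFilter(doc.metadata, { language: "English", id: 2 })
);
console.log(matches.map((doc) => doc.pageContent));
```

An empty filter object matches every document in this sketch, because there are no fields left to contradict.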


Vector indexes

With SingleStore DB version 8.5 or above, you can improve search efficiency by using ANN vector indexes. Set useVectorIndex: true when creating the vector store object to enable this feature. If your vectors differ in dimensionality from the default OpenAI embedding size of 1536, also specify the vectorSize parameter accordingly.
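As a rough sketch, the two options named above could be assembled as follows. The `VectorIndexOptions` interface and the `vectorIndexOptions` helper are hypothetical names introduced for illustration, and the 768-dimension value is an assumed example; only useVectorIndex, vectorSize, and the 1536 default come from the text.

```typescript
// Hypothetical subset of the store configuration,
// limited to the two options named in the text above.
interface VectorIndexOptions {
  useVectorIndex: boolean;
  vectorSize: number;
}

// Default OpenAI embedding size mentioned above.
const OPENAI_DEFAULT_DIMENSIONS = 1536;

// Hypothetical helper: builds the index-related options for a given embedding size.
function vectorIndexOptions(
  dimensions: number = OPENAI_DEFAULT_DIMENSIONS
): VectorIndexOptions {
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error("dimensions must be a positive integer");
  }
  return { useVectorIndex: true, vectorSize: dimensions };
}

// Example: an embedding model assumed to produce 768-dimensional vectors.
const indexOptions = vectorIndexOptions(768);
console.log(indexOptions);
```

The resulting object could then be spread into the configuration passed alongside connectionOptions when the store is created.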

Hybrid search

SingleStoreDB offers several search strategies for different use cases. The default VECTOR_ONLY strategy uses vector operations such as DOT_PRODUCT or EUCLIDEAN_DISTANCE to calculate similarity scores directly between vectors, while TEXT_ONLY uses Lucene-based full-text search, which suits text-centric applications. FILTER_BY_TEXT first narrows results by text similarity and then compares vectors, whereas FILTER_BY_VECTOR first narrows results by vector similarity and then assesses text similarity. WEIGHTED_SUM computes the final similarity score as a weighted combination of vector and text similarities; it works only with dot_product distance calculations.

FILTER_BY_TEXT, FILTER_BY_VECTOR, and WEIGHTED_SUM all require a full-text index. These three hybrid strategies blend vector and text-based search, so you can tune retrieval to the needs of your application.

import { SingleStoreVectorStore } from "@langchain/community/vectorstores/singlestore";
import { OpenAIEmbeddings } from "@langchain/openai";

export const run = async () => {
  const vectorStore = await SingleStoreVectorStore.fromTexts(
    [
      "In the parched desert, a sudden rainstorm brought relief, as the droplets danced upon the thirsty earth, rejuvenating the landscape with the sweet scent of petrichor.",
      "Amidst the bustling cityscape, the rain fell relentlessly, creating a symphony of pitter-patter on the pavement, while umbrellas bloomed like colorful flowers in a sea of gray.",
      "High in the mountains, the rain transformed into a delicate mist, enveloping the peaks in a mystical veil, where each droplet seemed to whisper secrets to the ancient rocks below.",
      "Blanketing the countryside in a soft, pristine layer, the snowfall painted a serene tableau, muffling the world in a tranquil hush as delicate flakes settled upon the branches of trees like nature's own lacework.",
      "In the urban landscape, snow descended, transforming bustling streets into a winter wonderland, where the laughter of children echoed amidst the flurry of snowballs and the twinkle of holiday lights.",
      "Atop the rugged peaks, snow fell with an unyielding intensity, sculpting the landscape into a pristine alpine paradise, where the frozen crystals shimmered under the moonlight, casting a spell of enchantment over the wilderness below.",
    ],
    [
      { category: "rain" },
      { category: "rain" },
      { category: "rain" },
      { category: "snow" },
      { category: "snow" },
      { category: "snow" },
    ],
    new OpenAIEmbeddings(),
    {
      connectionOptions: {
        host: process.env.SINGLESTORE_HOST,
        port: Number(process.env.SINGLESTORE_PORT),
        user: process.env.SINGLESTORE_USERNAME,
        password: process.env.SINGLESTORE_PASSWORD,
        database: process.env.SINGLESTORE_DATABASE,
      },
      distanceMetric: "DOT_PRODUCT",
      useVectorIndex: true,
      useFullTextIndex: true,
    }
  );

  // Default strategy (VECTOR_ONLY), combined with a metadata filter.
  const resultOne = await vectorStore.similaritySearch(
    "rainstorm in parched desert, rain",
    1,
    { category: "rain" }
  );
  console.log(resultOne[0].pageContent);

  // Full-text search only.
  await vectorStore.setSearchConfig({
    searchStrategy: "TEXT_ONLY",
  });
  const resultTwo = await vectorStore.similaritySearch(
    "rainstorm in parched desert, rain",
    1
  );
  console.log(resultTwo[0].pageContent);

  // Filter by text similarity first, then compare vectors.
  await vectorStore.setSearchConfig({
    searchStrategy: "FILTER_BY_TEXT",
    filterThreshold: 0.1,
  });
  const resultThree = await vectorStore.similaritySearch(
    "rainstorm in parched desert, rain",
    1
  );
  console.log(resultThree[0].pageContent);

  // Filter by vector similarity first, then assess text similarity.
  await vectorStore.setSearchConfig({
    searchStrategy: "FILTER_BY_VECTOR",
    filterThreshold: 0.1,
  });
  const resultFour = await vectorStore.similaritySearch(
    "rainstorm in parched desert, rain",
    1
  );
  console.log(resultFour[0].pageContent);

  // Weighted combination of text and vector similarity.
  await vectorStore.setSearchConfig({
    searchStrategy: "WEIGHTED_SUM",
    textWeight: 0.2,
    vectorWeight: 0.8,
    vectorselectCountMultiplier: 10,
  });
  const resultFive = await vectorStore.similaritySearch(
    "rainstorm in parched desert, rain",
    1
  );
  console.log(resultFive[0].pageContent);

  // Close the connection pool.
  await vectorStore.end();
};
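The difference between the filtering strategies and the weighted sum can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: the `ScoredDoc` shape, the helper names, the made-up scores, and the choice that higher scores are better and that the threshold is inclusive. It is not the library's or the database's implementation.

```typescript
// In-memory illustration only; names and scoring rules are assumptions.
interface ScoredDoc {
  id: string;
  textScore: number; // stand-in for a full-text relevance score
  vectorScore: number; // stand-in for a dot-product similarity
}

// WEIGHTED_SUM idea: blend both scores, then return the best k.
function weightedSum(
  docs: ScoredDoc[],
  textWeight: number,
  vectorWeight: number,
  k: number
): ScoredDoc[] {
  const combined = (d: ScoredDoc) =>
    textWeight * d.textScore + vectorWeight * d.vectorScore;
  return [...docs].sort((a, b) => combined(b) - combined(a)).slice(0, k);
}

// FILTER_BY_TEXT idea: keep docs whose text score passes the threshold,
// then rank the survivors by vector score.
function filterByText(
  docs: ScoredDoc[],
  filterThreshold: number,
  k: number
): ScoredDoc[] {
  return docs
    .filter((d) => d.textScore >= filterThreshold)
    .sort((a, b) => b.vectorScore - a.vectorScore)
    .slice(0, k);
}

// FILTER_BY_VECTOR idea: the mirror image, filtering on vector score
// and ranking the survivors by text score.
function filterByVector(
  docs: ScoredDoc[],
  filterThreshold: number,
  k: number
): ScoredDoc[] {
  return docs
    .filter((d) => d.vectorScore >= filterThreshold)
    .sort((a, b) => b.textScore - a.textScore)
    .slice(0, k);
}

// Made-up scores for three documents.
const scored: ScoredDoc[] = [
  { id: "a", textScore: 0.9, vectorScore: 0.2 },
  { id: "b", textScore: 0.3, vectorScore: 0.8 },
  { id: "c", textScore: 0.05, vectorScore: 0.95 },
];

console.log(weightedSum(scored, 0.2, 0.8, 1)[0].id);
console.log(filterByText(scored, 0.1, 1)[0].id);
console.log(filterByVector(scored, 0.1, 1)[0].id);
```

With these made-up scores each strategy picks a different top document, which is the practical reason to choose the strategy per use case: the weighted sum favours the strongest vector match, text filtering drops a document with weak text relevance before vectors are compared, and vector filtering lets text relevance decide among vector-similar candidates.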


Related