Notion API
This guide walks you through the steps required to load documents from Notion pages and databases using the Notion API.
Overview
Notion is a versatile productivity platform that consolidates note-taking, task management, and data organization tools into one interface.
This document loader takes full Notion pages and databases and turns them into LangChain Documents ready to be integrated into your projects.
Setup
You will first need to install the official Notion client and the notion-to-md package as peer dependencies:
- npm: npm install @langchain/community @langchain/core @notionhq/client notion-to-md
- Yarn: yarn add @langchain/community @langchain/core @notionhq/client notion-to-md
- pnpm: pnpm add @langchain/community @langchain/core @notionhq/client notion-to-md
- Create a Notion integration and securely record the Internal Integration Secret (also known as NOTION_INTEGRATION_TOKEN).
- Add a connection to your new integration on your page or database. To do this, open your Notion page, go to the settings pips (the three dots) in the top right, scroll down to Add connections, and select your new integration.
- Get the PAGE_ID or DATABASE_ID for the page or database you want to load.
The 32-character hexadecimal string in the URL path is the ID. For example:
REGEX:
/(?<!=)[0-9a-f]{32}/
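As a minimal sketch of how the regex above could be applied, the following TypeScript helper pulls the first matching ID out of a Notion URL. The function name extractNotionId and the sample URL are made up for illustration; they are not part of the loader's API.

```typescript
// Hypothetical helper (not part of @langchain/community): extracts the
// 32-character hex ID from a Notion URL using the regex shown above.
// The pattern is built from a string so the lookbehind is not rejected
// by older TypeScript compile targets.
const NOTION_ID_REGEX = new RegExp("(?<!=)[0-9a-f]{32}");

function extractNotionId(url: string): string | null {
  // The lookbehind (?<!=) skips hex strings that directly follow "=",
  // such as a view ID carried in a "?v=..." query parameter.
  const match = url.match(NOTION_ID_REGEX);
  return match ? match[0] : null;
}

// Made-up URL for illustration only.
const sampleUrl =
  "https://www.notion.so/my-workspace/My-Page-0123456789abcdef0123456789abcdef?pvs=4";
console.log(extractNotionId(sampleUrl)); // logs "0123456789abcdef0123456789abcdef"
```

The returned string can then be passed as the id option of the loader. A URL that only carries a hex string after "=" yields null, so callers should handle that case.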
Example Usage
import { NotionAPILoader } from "@langchain/community/document_loaders/web/notionapi";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Loading a page (including child pages all as separate documents)
const pageLoader = new NotionAPILoader({
  clientOptions: {
    auth: "<NOTION_INTEGRATION_TOKEN>",
  },
  id: "<PAGE_ID>",
  type: "page",
});

const splitter = new RecursiveCharacterTextSplitter();

// Load the documents
const pageDocs = await pageLoader.load();
// Split the documents using the text splitter
const splitDocs = await splitter.splitDocuments(pageDocs);

console.log({ splitDocs });

// Loading a database (each row is a separate document with all properties as metadata)
const dbLoader = new NotionAPILoader({
  clientOptions: {
    auth: "<NOTION_INTEGRATION_TOKEN>",
  },
  id: "<DATABASE_ID>",
  type: "database",
  onDocumentLoaded: (current, total, currentTitle) => {
    console.log(`Loaded Page: ${currentTitle} (${current}/${total})`);
  },
  callerOptions: {
    maxConcurrency: 64, // Default value
  },
  propertiesAsHeader: true, // Prepends a front matter header of the page properties to the page contents
});

// A database row's contents are likely to be under 1000 characters, so they are not split into multiple documents
const dbDocs = await dbLoader.load();

console.log({ dbDocs });
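
The example above writes the token placeholder directly into the code. One way to keep the secret out of source files, sketched here under the assumption that it is stored in an environment variable named NOTION_INTEGRATION_TOKEN, is to build the loader options from an environment map. The helper buildNotionLoaderOptions and the NotionLoaderOptions type are hypothetical; only the clientOptions.auth, id, and type fields come from the example above.

```typescript
// Hypothetical type covering only the fields used in the example above.
type NotionLoaderOptions = {
  clientOptions: { auth: string };
  id: string;
  type: "page" | "database";
};

// Hypothetical helper: builds the options object passed to NotionAPILoader,
// reading the secret from an environment map instead of a string literal.
function buildNotionLoaderOptions(
  env: Record<string, string | undefined>,
  id: string,
  type: "page" | "database"
): NotionLoaderOptions {
  // Assumption: the secret is stored under NOTION_INTEGRATION_TOKEN.
  const auth = env["NOTION_INTEGRATION_TOKEN"];
  if (!auth) {
    throw new Error("NOTION_INTEGRATION_TOKEN is not set");
  }
  return { clientOptions: { auth }, id, type };
}

// In Node.js this would typically be called with process.env, for example:
// const pageLoader = new NotionAPILoader(
//   buildNotionLoaderOptions(process.env, "<PAGE_ID>", "page")
// );
```

Failing early when the variable is missing gives a clearer error than an authentication failure from the Notion client later on.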
API Reference:
- NotionAPILoader from @langchain/community/document_loaders/web/notionapi
- RecursiveCharacterTextSplitter from @langchain/textsplitters