Notion API

This guide walks you through the steps required to load documents from Notion pages and databases using the Notion API.

Overview

Notion is a versatile productivity platform that consolidates note-taking, task management, and data organization tools into one interface.

This document loader takes full Notion pages and databases and turns them into LangChain Documents ready to be integrated into your projects.

Setup

  1. First, install the official Notion client and the notion-to-md package as peer dependencies:

npm install @langchain/community @langchain/core @notionhq/client notion-to-md

  2. Create a Notion integration and securely record the Internal Integration Secret (also known as NOTION_INTEGRATION_TOKEN).

  3. Add a connection to your new integration on your page or database. To do this, open your Notion page, open the settings menu (the dots in the top right), scroll down to Add connections, and select your new integration.

  4. Get the PAGE_ID or DATABASE_ID for the page or database you want to load.

The 32-character hexadecimal string in the URL path is the ID. For example:

PAGE_ID: https://www.notion.so/skarard/LangChain-Notion-API-b34ca03f219c4420a6046fc4bdfdf7b4

DATABASE_ID: https://www.notion.so/skarard/c393f19c3903440da0d34bf9c6c12ff2?v=9c70a0f4e174498aa0f9021e0a9d52de

REGEX: /(?<!=)[0-9a-f]{32}/
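
To illustrate how the regex above picks the ID out of a URL, here is a minimal sketch. The helper name `extractNotionId` is hypothetical and not part of any library; it assumes the ID is written in lowercase hex without dashes, as in the two example URLs.

```typescript
// Same pattern as the REGEX above, built with the RegExp constructor.
// The negative lookbehind (?<!=) skips any 32-character hex run that
// directly follows "=", such as the "?v=" view parameter on database URLs.
const NOTION_ID_REGEX = new RegExp("(?<!=)[0-9a-f]{32}");

// Hypothetical helper (an assumption for illustration): returns the first
// matching ID in the URL, or null when no ID is found.
function extractNotionId(url: string): string | null {
  const match = url.match(NOTION_ID_REGEX);
  return match ? match[0] : null;
}

const pageUrl =
  "https://www.notion.so/skarard/LangChain-Notion-API-b34ca03f219c4420a6046fc4bdfdf7b4";
const databaseUrl =
  "https://www.notion.so/skarard/c393f19c3903440da0d34bf9c6c12ff2?v=9c70a0f4e174498aa0f9021e0a9d52de";

console.log(extractNotionId(pageUrl)); // the PAGE_ID portion of the URL
console.log(extractNotionId(databaseUrl)); // the DATABASE_ID, not the view ID
```

The lookbehind is what keeps the view ID after `?v=` from being mistaken for a database ID when a URL contains no other match.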

Example Usage

import { NotionAPILoader } from "@langchain/community/document_loaders/web/notionapi";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Loading a page (including child pages all as separate documents)
const pageLoader = new NotionAPILoader({
  clientOptions: {
    auth: "<NOTION_INTEGRATION_TOKEN>",
  },
  id: "<PAGE_ID>",
  type: "page",
});

const splitter = new RecursiveCharacterTextSplitter();

// Load the documents
const pageDocs = await pageLoader.load();
// Split the documents using the text splitter
const splitDocs = await splitter.splitDocuments(pageDocs);

console.log({ splitDocs });

// Loading a database (each row is a separate document with all properties as metadata)
const dbLoader = new NotionAPILoader({
  clientOptions: {
    auth: "<NOTION_INTEGRATION_TOKEN>",
  },
  id: "<DATABASE_ID>",
  type: "database",
  onDocumentLoaded: (current, total, currentTitle) => {
    console.log(`Loaded Page: ${currentTitle} (${current}/${total})`);
  },
  callerOptions: {
    maxConcurrency: 64, // Default value
  },
  propertiesAsHeader: true, // Prepends a front matter header of the page properties to the page contents
});

// A database row's contents are likely to be under 1000 characters, so the rows are not split into multiple documents
const dbDocs = await dbLoader.load();

console.log({ dbDocs });
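
The example above uses a placeholder string for the integration token. Since the token is a secret, one option is to read it from the environment rather than writing it into source code. The following is a minimal sketch under that assumption; the helper `requireEnv` is hypothetical, and the in-memory map stands in for `process.env` so the snippet runs on its own.

```typescript
// Hypothetical helper (an assumption, not part of LangChain or the Notion
// client): reads a required setting from a key/value environment map and
// fails fast with a clear message when the value is missing or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const raw = env[name];
  const value = raw === undefined ? "" : raw.trim();
  if (value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In-memory stand-in for the real environment; the token value is made up.
// In a Node.js script you would pass process.env instead.
const exampleEnv: Record<string, string | undefined> = {
  NOTION_INTEGRATION_TOKEN: "example-token-value ",
};

// The trimmed result could then be passed as clientOptions.auth.
const auth = requireEnv("NOTION_INTEGRATION_TOKEN", exampleEnv);
console.log(`Token loaded (${auth.length} characters)`);
```

Failing at startup when the variable is absent tends to give a clearer error than an authentication failure surfacing later during loading.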
