GMail

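This loader shows how to load data from GMail. There are many ways to load data from GMail, and this loader is fairly opinionated about how it does so. It first looks up all messages that you have sent. It then looks for the ones in which you are replying to a previous email. Finally, it fetches that previous email and builds a training example from it followed by your reply.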

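Note that this approach has clear limitations. For example, every example is built using only the immediately preceding email as context, so earlier messages in a longer thread are not included.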
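The pairing strategy described above can be sketched in plain Python. This is only an illustration on hypothetical in-memory records: the `Email` dataclass, its fields, and `build_examples` are assumptions made for this sketch and are not the loader's actual implementation, which talks to the Gmail API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Email:
    # Hypothetical, simplified record; real Gmail messages carry far more metadata.
    id: str
    sender: str
    body: str
    in_reply_to: Optional[str] = None  # id of the email being answered, if any


def build_examples(emails: List[Email], me: str) -> List[Tuple[Email, Email]]:
    """Pair each reply sent by `me` with the single email it answers."""
    by_id = {email.id: email for email in emails}
    examples = []
    for email in emails:
        if email.sender != me or email.in_reply_to is None:
            continue  # not sent by me, or not a reply
        previous = by_id.get(email.in_reply_to)
        if previous is not None:
            # Only the immediately preceding email is kept as context.
            examples.append((previous, email))
    return examples


inbox = [
    Email("1", "alice@example.com", "Can we meet on Friday?"),
    Email("2", "me@example.com", "Friday works for me.", in_reply_to="1"),
    Email("3", "me@example.com", "Sending the quarterly report."),
]
examples = build_examples(inbox, me="me@example.com")
print([(prev.id, reply.id) for prev, reply in examples])  # [('1', '2')]
```

Message 3 is skipped because it is not a reply, which mirrors the limitation noted above: only reply pairs become training examples, and each pair carries a single email of context.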

To use:

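Set up a Google Developer Account: Go to the Google Developer Console, create a project, and enable the Gmail API for that project. This gives you a credentials.json file that you will need later.
Install the Google Client Library: Run the following command to install the Google client libraries: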

%pip install --upgrade --quiet google-auth google-auth-oauthlib google-auth-httplib2 google-api-python-client

import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]


creds = None
# The file email_token.json stores the user's access and refresh tokens, and is
# created automatically when the authorization flow completes for the first
# time.
if os.path.exists("email_token.json"):
    creds = Credentials.from_authorized_user_file("email_token.json", SCOPES)
# If there are no (valid) credentials available, let the user log in.
if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file(
            # Path to your OAuth client secrets file (the credentials.json from the
            # setup step, saved here as creds.json). For how to create it, see
            # https://cloud.google.com/docs/authentication/getting-started
            "creds.json",
            SCOPES,
        )
        creds = flow.run_local_server(port=0)
    # Save the credentials for the next run
    with open("email_token.json", "w") as token:
        token.write(creds.to_json())

from langchain_community.chat_loaders.gmail import GMailLoader

API Reference: GMailLoader

loader = GMailLoader(creds=creds, n=3)

data = loader.load()

# Sometimes there can be errors which we silently ignore
len(data)
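To see what was loaded, a small helper can list who wrote each message. This sketch assumes that each item in `data` is a dict-like chat session with a `messages` list, and that each message exposes `content` and stores its author under `additional_kwargs["sender"]`. The `summarize` helper and the stand-in objects below are hypothetical; they only mimic that shape so the snippet runs without Gmail access.

```python
from types import SimpleNamespace


def summarize(sessions, width=40):
    """Return one (sender, preview) row per message across all sessions."""
    rows = []
    for session in sessions:
        for message in session["messages"]:
            sender = message.additional_kwargs.get("sender", "<unknown>")
            # Collapse whitespace and truncate so each preview fits on one line.
            preview = " ".join(message.content.split())[:width]
            rows.append((sender, preview))
    return rows


# Hypothetical stand-in for one loaded session: the earlier email, then the reply.
fake_session = {
    "messages": [
        SimpleNamespace(
            content="Can we meet on Friday?",
            additional_kwargs={"sender": "Alice <alice@example.com>"},
        ),
        SimpleNamespace(
            content="Friday works\nfor me.",
            additional_kwargs={"sender": "Me <me@example.com>"},
        ),
    ]
}

for sender, preview in summarize([fake_session]):
    print(f"{sender}: {preview}")
```

If the assumed shape holds, calling `summarize(data)` on the real result is a quick way to check the exact sender strings before choosing which one to treat as the AI.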

from langchain_community.chat_loaders.utils import (
    map_ai_messages,
)

API Reference: map_ai_messages

# This makes messages sent by hchase@langchain.com the AI Messages
# This means you will train an LLM to predict as if it's responding as hchase
training_data = list(
    map_ai_messages(data, sender="Harrison Chase <hchase@langchain.com>")
)
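Conceptually, this step relabels messages by sender: messages from the chosen sender become the AI side of the conversation, and everything else stays on the human side. The following is a simplified sketch of that idea using plain dictionaries. The `assign_roles` function and the message fields are hypothetical and do not reproduce how `map_ai_messages` is implemented.

```python
def assign_roles(messages, ai_sender):
    """Tag each message as "ai" if it was written by `ai_sender`, else "human"."""
    return [
        {**message, "role": "ai" if message["sender"] == ai_sender else "human"}
        for message in messages
    ]


thread = [
    {"sender": "Alice <alice@example.com>", "content": "Can we meet on Friday?"},
    {"sender": "Me <me@example.com>", "content": "Friday works for me."},
]
labeled = assign_roles(thread, ai_sender="Me <me@example.com>")
print([message["role"] for message in labeled])  # ['human', 'ai']
```

The sketch compares the full sender string, so a bare address such as `me@example.com` would not match here. The call above likewise passes the display name together with the address.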