Upload files with traces
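Before diving into this content, it may help to read the following guides:
The features below are available in the following SDK versions:
- Python SDK: >=0.1.141
- JS/TS SDK: >=0.2.5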
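LangSmith supports uploading binary files (such as images, audio, videos, PDFs, and CSVs) with your traces. This is especially useful when working with LLM pipelines that use multimodal inputs or outputs.
In both the Python and TypeScript SDKs, you can add attachments to a trace by specifying the MIME type and binary content of each file.
This guide shows how to define and trace attachments using the Attachment type in Python and the Uint8Array / ArrayBuffer types in TypeScript.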
- Python
- TypeScript
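In the Python SDK, you can use the Attachment type to add files to your traces.
Each Attachment requires:
- mime_type (str): the MIME type of the file (for example, "image/png").
- data (bytes | Path): the binary content of the file, or the file path.
You can also define an attachment as a tuple of the form (mime_type, data) for convenience.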
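When the MIME type is not known ahead of time, one option is to guess it from the file name with Python's standard mimetypes module and build the (mime_type, data) tuple from the result. The sketch below is illustrative only: guess_mime_type and make_attachment_tuple are hypothetical helper names, not part of the LangSmith SDK, and the fallback to "application/octet-stream" is an assumption.

```python
import mimetypes
from typing import Tuple


def guess_mime_type(filename: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to a generic default."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or default


def make_attachment_tuple(filename: str, data: bytes) -> Tuple[str, bytes]:
    """Hypothetical helper: build a (mime_type, data) tuple from a file name and bytes."""
    return (guess_mime_type(filename), data)


# Placeholder bytes stand in for real file contents
pdf_attachment = make_attachment_tuple("my_document.pdf", b"%PDF-1.7 example bytes")
print(pdf_attachment[0])
```

Guessing from the extension is only a convenience; when you already know the file type, stating the MIME type explicitly is less error-prone.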
Simply decorate a function with @traceable and pass your Attachment instances as arguments.
Note that to use a file path instead of raw bytes, you must set the dangerously_allow_filesystem flag to True in the traceable decorator.
from pathlib import Path
from langsmith import traceable
from langsmith.schemas import Attachment
# Must set dangerously_allow_filesystem to True if you want to use file paths
@traceable(dangerously_allow_filesystem=True)
def trace_with_attachments(
    val: int,
    text: str,
    image: Attachment,
    audio: Attachment,
    video: Attachment,
    pdf: Attachment,
    csv: Attachment,
):
    # pdf is passed as a plain (mime_type, data) tuple, so index it instead of using .data;
    # csv.data is a Path rather than bytes, so report the path instead of a length
    return (
        f"Processed: {val}, {text}, {len(image.data)}, {len(audio.data)}, "
        f"{len(video.data)}, {len(pdf[1])}, {csv.data}"
    )
# Helper function to load files as bytes
def load_file(file_path: str) -> bytes:
    with open(file_path, "rb") as f:
        return f.read()
# Load files and create attachments
image_data = load_file("my_image.png")
audio_data = load_file("my_mp3.mp3")
video_data = load_file("my_video.mp4")
pdf_data = load_file("my_document.pdf")
image_attachment = Attachment(mime_type="image/png", data=image_data)
audio_attachment = Attachment(mime_type="audio/mpeg", data=audio_data)
video_attachment = Attachment(mime_type="video/mp4", data=video_data)
pdf_attachment = ("application/pdf", pdf_data)  # Can just define as tuple of (mime_type, data)
# Passing a Path instead of bytes is what requires dangerously_allow_filesystem=True
csv_attachment = Attachment(mime_type="text/csv", data=Path.cwd() / "my_csv.csv")
# Define other parameters
val = 42
text = "Hello, world!"
# Call the function with traced attachments
result = trace_with_attachments(
    val=val,
    text=text,
    image=image_attachment,
    audio=audio_attachment,
    video=video_attachment,
    pdf=pdf_attachment,
    csv=csv_attachment,
)
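The CSV attachment above is passed as a Path, which is why the decorator needs dangerously_allow_filesystem=True. If you would rather leave that flag off, one alternative is to read every file into bytes yourself before building the attachment, as load_file does for the other files. A minimal sketch, assuming a hypothetical helper named to_bytes (not part of the SDK):

```python
import tempfile
from pathlib import Path
from typing import Union


def to_bytes(data: Union[bytes, Path]) -> bytes:
    """Hypothetical helper: return raw bytes, reading from disk when given a Path."""
    if isinstance(data, Path):
        return data.read_bytes()
    return data


# Demonstrate with a temporary CSV so the snippet runs on its own
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "my_csv.csv"
    csv_path.write_bytes(b"a,b\n1,2\n")
    # Both forms end up as (mime_type, bytes) tuples
    csv_attachment = ("text/csv", to_bytes(csv_path))
    inline_attachment = ("text/csv", to_bytes(b"a,b\n1,2\n"))
```

Because both tuples carry bytes rather than a path, they fit the (mime_type, data) form described earlier without relying on filesystem access at trace time.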
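In the TypeScript SDK, you can add attachments to traces by using Uint8Array or ArrayBuffer as the data type.
The MIME type of each attachment is specified within extractAttachments:
- Uint8Array: useful for handling binary data directly.
- ArrayBuffer: represents fixed-length binary data, which can be converted to a Uint8Array as needed.
Wrap your function with traceable and include your attachments in the extractAttachments option.
In the TypeScript SDK, the extractAttachments function is an optional parameter in the traceable configuration. When the traceable-wrapped function is invoked, it extracts binary data (for example, images or audio files) from your inputs and logs it alongside the other trace data, specifying its MIME type.
Note that you cannot pass a file path directly in the TypeScript SDK, because accessing local files is not supported in all runtime environments.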
type AttachmentData = Uint8Array | ArrayBuffer;
type Attachments = Record<string, [string, AttachmentData]>;
extractAttachments?: (
  ...args: Parameters<Func>
) => [Attachments | undefined, KVMap];
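To make the shape of that return value concrete, here is a plain-Python model of the same idea: a function that receives the call arguments and returns a pair of (attachments, remaining inputs). This is only an illustration of the data flow; extract_attachments is a hypothetical name and is not part of either SDK.

```python
from typing import Any, Dict, Tuple

# Mirrors the TypeScript types above: attachment name -> (mime_type, binary data)
Attachments = Dict[str, Tuple[str, bytes]]


def extract_attachments(
    val: int, text: str, image: bytes, audio: bytes
) -> Tuple[Attachments, Dict[str, Any]]:
    """Hypothetical model: split call arguments into attachments and plain inputs."""
    attachments: Attachments = {
        "image inputs": ("image/png", image),
        "mp3 inputs": ("audio/mpeg", audio),
    }
    # Everything that is not binary stays in the regular inputs
    inputs = {"val": val, "text": text}
    return attachments, inputs


# Placeholder bytes stand in for real file contents
attachments, inputs = extract_attachments(42, "Hello, world!", b"\x89PNG", b"ID3")
```

The first element holds the named binary payloads with their MIME types, and the second holds the remaining inputs, which matches the two-element return type in the signature.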
import { traceable } from "langsmith/traceable";
const traceableWithAttachments = traceable(
  (
    val: number,
    text: string,
    attachment: Uint8Array,
    attachment2: ArrayBuffer,
    attachment3: Uint8Array,
    attachment4: ArrayBuffer,
    attachment5: Uint8Array,
  ) =>
    `Processed: ${val}, ${text}, ${attachment.length}, ${attachment2.byteLength}, ${attachment3.length}, ${attachment4.byteLength}, ${attachment5.length}`,
  {
    name: "traceWithAttachments",
    // Map each binary argument to a named [mimeType, data] pair;
    // the second element holds the remaining (non-binary) inputs
    extractAttachments: (
      val: number,
      text: string,
      attachment: Uint8Array,
      attachment2: ArrayBuffer,
      attachment3: Uint8Array,
      attachment4: ArrayBuffer,
      attachment5: Uint8Array,
    ) => [
      {
        "image inputs": ["image/png", attachment],
        "mp3 inputs": ["audio/mpeg", new Uint8Array(attachment2)],
        "video inputs": ["video/mp4", attachment3],
        "pdf inputs": ["application/pdf", new Uint8Array(attachment4)],
        "csv inputs": ["text/csv", attachment5],
      },
      { val, text },
    ],
  }
);
// Deno is used here; in Node.js, use `import * as fs from "node:fs/promises"` instead
const fs = Deno;
const image = await fs.readFile("my_image.png"); // Uint8Array
const mp3Buffer = await fs.readFile("my_mp3.mp3");
const mp3ArrayBuffer = mp3Buffer.buffer; // Convert to ArrayBuffer
const video = await fs.readFile("my_video.mp4"); // Uint8Array
const pdfBuffer = await fs.readFile("my_document.pdf");
const pdfArrayBuffer = pdfBuffer.buffer; // Convert to ArrayBuffer
const csv = await fs.readFile("test-vals.csv"); // Uint8Array
// Define example parameters
const val = 42;
const text = "Hello, world!";
// Call traceableWithAttachments with the files
const result = await traceableWithAttachments(val, text, image, mp3ArrayBuffer, video, pdfArrayBuffer, csv);
This is how the above appears in the LangSmith UI. You can expand each attachment to view its contents.
