RunPod

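RunPod provides GPU cloud infrastructure, including serverless endpoints optimized for deploying and scaling AI models.

This guide explains how to use the langchain-runpod integration package to connect LangChain applications to models hosted on RunPod Serverless.

The integration provides interfaces for both standard language models (LLMs) and chat models.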

Installation

Install the dedicated partner package:

%pip install -qU langchain-runpod

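Setup

1. Deploy an endpoint on RunPod

Navigate to your RunPod Serverless console.
Create a "New Endpoint", selecting a GPU and a template (for example, vLLM, TGI, or text-generation-webui) that are compatible with your model and with the expected input/output formats (see the component guides or the package README).
Configure the settings and deploy.
Crucially, copy the Endpoint ID once the deployment finishes.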
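
The Endpoint ID identifies your deployment in RunPod's serverless REST API, which is why the last step asks you to copy it. As a rough illustration, the sketch below builds (but does not send) a synchronous request against `https://api.runpod.ai/v2/<endpoint_id>/runsync`. The helper name `build_runsync_request`, the placeholder ID and key, and the `{"prompt": ...}` input shape are assumptions made for this example; the real input schema depends on the handler your template runs, and the langchain-runpod package takes care of these calls for you.

```python
import json
import urllib.request


def build_runsync_request(
    endpoint_id: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Build (without sending) a POST request for a RunPod serverless endpoint.

    Hypothetical helper for illustration only. The shape of `payload` is an
    assumption and must match whatever your endpoint's handler expects.
    """
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    # RunPod serverless requests wrap the handler input in an "input" key.
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
request = build_runsync_request("my-endpoint-id", "my-api-key", {"prompt": "Hello"})
print(request.full_url)
```

Seeing the URL spelled out makes it clear that a wrong or stale Endpoint ID points at a different (or nonexistent) deployment, which is a common cause of invocation errors.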

2. Set API credentials

The integration requires your RunPod API key and Endpoint ID. Set them as environment variables for secure access:

import getpass
import os

os.environ["RUNPOD_API_KEY"] = getpass.getpass("Enter your RunPod API Key: ")
os.environ["RUNPOD_ENDPOINT_ID"] = input("Enter your RunPod Endpoint ID: ")

(Optional) If you use different endpoints for the LLM and chat models, you may need to set RUNPOD_CHAT_ENDPOINT_ID or pass the ID directly during initialization.
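
When two endpoints are in play, it can help to make the lookup order explicit in your own code. The following is a minimal sketch, assuming you want an explicitly passed ID to win, then `RUNPOD_CHAT_ENDPOINT_ID` (for chat models only), then `RUNPOD_ENDPOINT_ID`. The function `resolve_endpoint_id` and this precedence are illustrative assumptions, not the documented behavior of langchain-runpod.

```python
import os
from typing import Mapping, Optional


def resolve_endpoint_id(
    explicit_id: Optional[str] = None,
    chat: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick an endpoint ID: explicit argument first, then environment variables.

    Hypothetical helper; the fallback order is an assumption for illustration.
    """
    if explicit_id:
        return explicit_id
    source = os.environ if env is None else env
    names = ["RUNPOD_ENDPOINT_ID"]
    if chat:
        # Prefer the chat-specific variable when resolving for a chat model.
        names.insert(0, "RUNPOD_CHAT_ENDPOINT_ID")
    for name in names:
        value = source.get(name, "").strip()
        if value:
            return value
    raise ValueError(f"No endpoint ID found; set one of: {', '.join(names)}")


# Demo with an in-memory mapping so the real environment is left untouched.
demo_env = {
    "RUNPOD_ENDPOINT_ID": "llm-endpoint",
    "RUNPOD_CHAT_ENDPOINT_ID": "chat-endpoint",
}
llm_id = resolve_endpoint_id(env=demo_env)
chat_id = resolve_endpoint_id(chat=True, env=demo_env)
print(llm_id, chat_id)
```

The resolved value can then be handed to the model at initialization; check the package README for the exact constructor parameter name.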

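Components

The package provides two main components:

1. LLM

For interacting with standard text-completion models.

For detailed usage, see the RunPod LLM integration guide.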

from langchain_runpod import RunPod

# Example initialization (uses environment variables)
llm = RunPod(model_kwargs={"max_new_tokens": 100})  # Add generation params here

# Example Invocation
try:
    response = llm.invoke("Write a short poem about the cloud.")
    print(response)
except Exception as e:
    print(
        f"Error invoking LLM: {e}. Ensure endpoint ID and API key are correct and endpoint is active."
    )

2. Chat model

For interacting with conversational models.

For detailed usage and feature support, see the RunPod chat model integration guide.

from langchain_core.messages import HumanMessage
from langchain_runpod import ChatRunPod

# Example initialization (uses environment variables)
chat = ChatRunPod(model_kwargs={"temperature": 0.8})  # Add generation params here

# Example Invocation
try:
    response = chat.invoke(
        [HumanMessage(content="Explain RunPod Serverless in one sentence.")]
    )
    print(response.content)
except Exception as e:
    print(
        f"Error invoking Chat Model: {e}. Ensure endpoint ID and API key are correct and endpoint is active."
    )

API Reference: HumanMessage
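
Both invocation examples catch the exception and print a hint. If failures are transient, for instance because the endpoint is still bringing up a worker, one option is to retry with exponential backoff before giving up. The sketch below is a generic wrapper around any zero-argument callable; `invoke_with_retry` is a hypothetical helper, not part of langchain-runpod, and the `flaky_invoke` function merely simulates an endpoint that succeeds on the third attempt.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def invoke_with_retry(
    call: Callable[[], T], attempts: int = 3, base_delay: float = 1.0
) -> T:
    """Run `call`, retrying on any exception with exponential backoff.

    Hypothetical helper for illustration; tune `attempts` and `base_delay`
    to your endpoint's behavior.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as error:
            last_error = error
            if attempt < attempts - 1:
                # Wait base_delay, then 2x, 4x, ... between attempts.
                time.sleep(base_delay * 2**attempt)
    raise RuntimeError(f"Call failed after {attempts} attempts") from last_error


# Simulated endpoint: fails twice, then succeeds.
calls = {"count": 0}


def flaky_invoke() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("endpoint not ready")
    return "ok"


result = invoke_with_retry(flaky_invoke, attempts=3, base_delay=0.0)
print(result)
```

With a real model, the callable would wrap the invocation, for example `invoke_with_retry(lambda: llm.invoke("Write a short poem about the cloud."))`. Retrying does not help with a wrong API key or Endpoint ID, so keep the error message visible when the final attempt fails.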
