
AzureMLChatOnlineEndpoint

Azure Machine Learning is a platform used to build, train, and deploy machine learning models. Users can explore the types of models to deploy in the Model Catalog, which provides foundational and general-purpose models from different providers.

In general, you need to deploy models in order to consume their predictions (inference). In Azure Machine Learning, Online Endpoints are used to deploy these models for real-time serving. They are built on the ideas of Endpoints and Deployments, which allow you to decouple the interface of your production workload from the implementation that serves it.

This notebook goes over how to use a chat model hosted on an Azure Machine Learning Endpoint.

from langchain_community.chat_models.azureml_endpoint import AzureMLChatOnlineEndpoint

Setup

You must deploy a model on Azure ML or to Azure AI Studio and obtain the following parameters:

  • endpoint_url: The REST endpoint URL provided by the endpoint.
  • endpoint_api_type: Use endpoint_type='dedicated' when deploying models to dedicated endpoints (hosted managed infrastructure). Use endpoint_type='serverless' when deploying models with a pay-as-you-go offering (model as a service).
  • endpoint_api_key: The API key provided by the endpoint.
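These values are usually kept out of source code. A minimal sketch of loading them from environment variables (the variable names here are illustrative assumptions, not names LangChain requires):

```python
import os

# Illustrative variable names -- pick any scheme; LangChain does not mandate these.
endpoint_url = os.getenv(
    "AZUREML_ENDPOINT_URL",
    "https://<your-endpoint>.<your_region>.inference.ml.azure.com/score",
)
endpoint_api_key = os.getenv("AZUREML_ENDPOINT_API_KEY", "")
# "dedicated" for managed infrastructure, "serverless" for pay-as-you-go.
endpoint_api_type = os.getenv("AZUREML_ENDPOINT_API_TYPE", "dedicated")
```

The loaded strings can then be passed to the `AzureMLChatOnlineEndpoint` constructor shown below.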

Content Formatter

The content_formatter parameter is a handler class for transforming the request and response of an AzureML endpoint to match the required schema. Since there is a wide range of models in the model catalog, each of which may process data differently, a ContentFormatterBase class is provided to allow users to transform data to their liking. The following content formatter is provided:

  • CustomOpenAIChatContentFormatter: Formats request and response data for models such as LLaMa2-chat that follow the OpenAI API spec for requests and responses.

Note: langchain.chat_models.azureml_endpoint.LlamaChatContentFormatter is being deprecated and replaced with langchain.chat_models.azureml_endpoint.CustomOpenAIChatContentFormatter.

You can implement custom content formatters specific to your model by deriving from the class langchain_community.llms.azureml_endpoint.ContentFormatterBase.
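To make the formatter's role concrete: for an OpenAI-compatible model, the formatter must produce a request body shaped like the OpenAI chat completions schema. A hand-built sketch of that payload (illustrative only, not LangChain internals):

```python
import json

# The OpenAI-style chat request body that formatters such as
# CustomOpenAIChatContentFormatter target: a list of role/content
# messages plus optional sampling parameters.
request_body = {
    "messages": [
        {"role": "user", "content": "Will the Collatz conjecture ever be solved?"}
    ],
    "temperature": 0.8,
}
payload = json.dumps(request_body)
```

A custom formatter's job is essentially to build a payload like this from LangChain messages on the way in, and to extract the assistant message from the endpoint's JSON on the way out.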

Examples

The following section contains examples of how to use this class:

Example: Chat completions with real-time endpoints

from langchain_community.chat_models.azureml_endpoint import (
    AzureMLEndpointApiType,
    CustomOpenAIChatContentFormatter,
)
from langchain_core.messages import HumanMessage

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.dedicated,
    endpoint_api_key="my-api-key",
    content_formatter=CustomOpenAIChatContentFormatter(),
)
response = chat.invoke(
    [HumanMessage(content="Will the Collatz conjecture ever be solved?")]
)
response
AIMessage(content='  The Collatz Conjecture is one of the most famous unsolved problems in mathematics, and it has been the subject of much study and research for many years. While it is impossible to predict with certainty whether the conjecture will ever be solved, there are several reasons why it is considered a challenging and important problem:\n\n1. Simple yet elusive: The Collatz Conjecture is a deceptively simple statement that has proven to be extraordinarily difficult to prove or disprove. Despite its simplicity, the conjecture has eluded some of the brightest minds in mathematics, and it remains one of the most famous open problems in the field.\n2. Wide-ranging implications: The Collatz Conjecture has far-reaching implications for many areas of mathematics, including number theory, algebra, and analysis. A solution to the conjecture could have significant impacts on these fields and potentially lead to new insights and discoveries.\n3. Computational evidence: While the conjecture remains unproven, extensive computational evidence supports its validity. In fact, no counterexample to the conjecture has been found for any starting value up to 2^64 (a number', additional_kwargs={}, example=False)

Example: Chat completions with pay-as-you-go deployments (model as a service)

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/v1/chat/completions",
    endpoint_api_type=AzureMLEndpointApiType.serverless,
    endpoint_api_key="my-api-key",
    content_formatter=CustomOpenAIChatContentFormatter(),
)
response = chat.invoke(
    [HumanMessage(content="Will the Collatz conjecture ever be solved?")]
)
response

If you need to pass additional parameters to the model, use the model_kwargs argument:

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://<your-endpoint>.<your_region>.inference.ml.azure.com/v1/chat/completions",
    endpoint_api_type=AzureMLEndpointApiType.serverless,
    endpoint_api_key="my-api-key",
    content_formatter=CustomOpenAIChatContentFormatter(),
    model_kwargs={"temperature": 0.8},
)

Parameters can also be passed during invocation:

response = chat.invoke(
    [HumanMessage(content="Will the Collatz conjecture ever be solved?")],
    max_tokens=512,
)
response