LlamaEdge
LlamaEdge is the easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge.
- Lightweight inference apps. LlamaEdge is in MBs instead of GBs
- Native and GPU accelerated performance
- Supports many GPU and hardware accelerators
- Supports many optimized inference libraries
- Wide selection of AI / LLM models
Installation and Setup
See the installation instructions.
Chat models
See a usage example.
from langchain_community.chat_models.llama_edge import LlamaEdgeChatService
API Reference: LlamaEdgeChatService