LlamaEdge
LlamaEdge is the easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge.
- Lightweight inference apps. LlamaEdge is in MBs instead of GBs
- Native and GPU accelerated performance
- Supports many GPU and hardware accelerators
- Supports many optimized inference libraries
- Wide selection of AI / LLM models
Installation and Setup
See the installation instructions.
Chat models
See a usage example.
from langchain_community.chat_models.llama_edge import LlamaEdgeChatService
API Reference: LlamaEdgeChatService