
LlamaEdge

LlamaEdge is the easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge.

  • Lightweight inference apps. LlamaEdge is in MBs instead of GBs
  • Native and GPU accelerated performance
  • Supports many GPU and hardware accelerators
  • Supports many optimized inference libraries
  • Wide selection of AI / LLM models

Installation and Setup

See the installation instructions.

Chat Models

See a usage example.

from langchain_community.chat_models.llama_edge import LlamaEdgeChatService