FastEmbed by Qdrant
FastEmbed from Qdrant is a lightweight, fast Python library built for embedding generation.
- Quantized model weights
- ONNX Runtime, no PyTorch dependency
- CPU-first design
- Data parallelism for encoding large datasets
Dependencies
To use FastEmbed with LangChain, install the fastembed Python package.
%pip install --upgrade --quiet fastembed
Imports
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
Instantiating FastEmbed
Parameters

- model_name: str (default: "BAAI/bge-small-en-v1.5"). Name of the FastEmbedding model to use. You can find the list of supported models here.
- max_length: int (default: 512). The maximum number of tokens. Behavior for values > 512 is unknown.
- cache_dir: Optional[str] (default: None). The path to the cache directory. Defaults to local_cache in the parent directory.
- threads: Optional[int] (default: None). The number of threads a single onnxruntime session can use.
- doc_embed_type: Literal["default", "passage"] (default: "default"). "default": uses FastEmbed's default embedding method. "passage": prefixes the text with "passage" before embedding.
- batch_size: int (default: 256). Batch size for encoding. Higher values use more memory but are faster.
- parallel: Optional[int] (default: None). If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data parallelism is not used and the default onnxruntime threading is used instead.
embeddings = FastEmbedEmbeddings()
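Any of the parameters above can also be overridden at construction time. A sketch (the values here are illustrative, not recommendations):

```python
embeddings = FastEmbedEmbeddings(
    model_name="BAAI/bge-small-en-v1.5",  # default model; see the supported models list
    max_length=512,                       # maximum number of tokens per input
    batch_size=256,                       # higher uses more memory but is faster
    parallel=0,                           # 0 = use all available cores for encoding
)
```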
Usage
Generating document embeddings
document_embeddings = embeddings.embed_documents(
["This is a document", "This is some other document"]
)
Generating query embeddings
query_embeddings = embeddings.embed_query("This is a query")
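A common next step is to rank documents by cosine similarity between the query embedding and each document embedding. A minimal sketch, using plain Python and dummy placeholder vectors in place of the real embeddings returned above:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Dummy vectors standing in for embed_documents / embed_query output.
document_embeddings = [[0.1, 0.9, 0.0], [0.8, 0.1, 0.1]]
query_embedding = [0.2, 0.8, 0.0]

scores = [cosine_similarity(query_embedding, d) for d in document_embeddings]
best = max(range(len(scores)), key=scores.__getitem__)  # index of the closest document
```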