FastEmbed by Qdrant
FastEmbed from Qdrant is a lightweight, fast Python library built for embedding generation.
- Quantized model weights
- ONNX Runtime, no PyTorch dependency
- CPU-first design
- Data parallelism for encoding large datasets
Dependencies
To use FastEmbed with LangChain, install the fastembed Python package.
%pip install --upgrade --quiet fastembed
Imports
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
Instantiating FastEmbed
Parameters

- model_name: str (default: "BAAI/bge-small-en-v1.5"). Name of the FastEmbedding model to use. You can find the list of supported models here.
- max_length: int (default: 512). The maximum number of tokens. Behavior for values > 512 is unknown.
- cache_dir: Optional[str] (default: None). The path to the cache directory. Defaults to local_cache in the parent directory.
- threads: Optional[int] (default: None). The number of threads a single onnxruntime session can use.
- doc_embed_type: Literal["default", "passage"] (default: "default"). "default": uses FastEmbed's default embedding method. "passage": prefixes the text with "passage" before embedding.
- batch_size: int (default: 256). Batch size for encoding. Higher values use more memory but are faster.
- parallel: Optional[int] (default: None). If > 1, data-parallel encoding is used; recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data parallelism is not used and the default onnxruntime threading is used instead.
embeddings = FastEmbedEmbeddings()
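Any of the parameters above can also be overridden at construction time. A sketch (the values here are illustrative, not recommendations):

```python
embeddings = FastEmbedEmbeddings(
    model_name="BAAI/bge-small-en-v1.5",  # default model; see the supported models list
    max_length=512,                       # maximum number of tokens per input
    batch_size=256,                       # higher uses more memory but is faster
    parallel=0,                           # 0 = use all available cores for encoding
)
```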
Usage
Generating document embeddings
document_embeddings = embeddings.embed_documents(
["This is a document", "This is some other document"]
)
Generating query embeddings
query_embeddings = embeddings.embed_query("This is a query")
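A common next step is to rank documents by cosine similarity between the query embedding and each document embedding. A minimal sketch, using plain Python and dummy placeholder vectors in place of the real embeddings returned above:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Dummy vectors standing in for embed_documents / embed_query output.
document_embeddings = [[0.1, 0.9, 0.0], [0.8, 0.1, 0.1]]
query_embedding = [0.2, 0.8, 0.0]

scores = [cosine_similarity(query_embedding, d) for d in document_embeddings]
best = max(range(len(scores)), key=scores.__getitem__)  # index of the closest document
```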