
Tigris

Tigris is an open-source Serverless NoSQL Database and Search Platform designed to simplify building high-performance vector search applications. Tigris eliminates the infrastructure complexity of managing, operating, and synchronizing multiple tools, allowing you to focus on building great applications instead.

This notebook shows how to use Tigris as your vector store.

Prerequisites

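  1. An OpenAI account. You can sign up for one on the OpenAI website.
  2. A free Tigris account. Once you have signed up, create a new project called vectordemo. Next, make a note of the Uri for the region you created your project in, along with the clientId and clientSecret. All of this information is available in the Application Keys section of the project.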

Let's first install the dependencies:

%pip install --upgrade --quiet  tigrisdb openapi-schema-pydantic langchain-openai langchain-community tiktoken

We will load the OpenAI API key and the Tigris credentials into the environment:

import getpass
import os

# Prompt only for the values that are not already set in the environment.
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
if "TIGRIS_PROJECT" not in os.environ:
    os.environ["TIGRIS_PROJECT"] = getpass.getpass("Tigris Project Name:")
if "TIGRIS_CLIENT_ID" not in os.environ:
    os.environ["TIGRIS_CLIENT_ID"] = getpass.getpass("Tigris Client Id:")
if "TIGRIS_CLIENT_SECRET" not in os.environ:
    os.environ["TIGRIS_CLIENT_SECRET"] = getpass.getpass("Tigris Client Secret:")

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Tigris
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
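
Before going further, it can help to confirm that all four settings prompted for above actually made it into the environment. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` and the `REQUIRED_VARS` tuple are made up for this illustration and are not part of the Tigris or LangChain APIs.

```python
import os

# The four variable names used by the credential prompts above.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "TIGRIS_PROJECT",
    "TIGRIS_CLIENT_ID",
    "TIGRIS_CLIENT_SECRET",
)


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given mapping.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All OpenAI and Tigris settings are present.")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching the real environment.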

Initialize Tigris vector store

Let's import our test dataset:

loader = TextLoader("../../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
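
To get a feel for what `chunk_size=1000` with `chunk_overlap=0` means, here is a simplified, standard-library sketch of separator-based chunking: split on blank lines, then greedily pack the pieces into chunks that stay within the size limit. It is only an illustration of the idea and not the `CharacterTextSplitter` implementation; the function name `merge_paragraphs` is hypothetical, and details such as whitespace stripping and overlap handling are left out.

```python
def merge_paragraphs(text, chunk_size=1000, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks.

    Each chunk holds at most chunk_size characters, except that a single
    piece longer than chunk_size is kept whole rather than cut.
    """
    chunks = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
for chunk in merge_paragraphs(sample, chunk_size=40):
    print(repr(chunk))
```

With the small limit of 40 characters, the first two paragraphs fit together in one chunk and the third starts a new one.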

embeddings = OpenAIEmbeddings()
vector_store = Tigris.from_documents(docs, embeddings, index_name="my_embeddings")

query = "What did the president say about Ketanji Brown Jackson"
found_docs = vector_store.similarity_search(query)
print(found_docs)
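
Conceptually, a similarity search embeds the query and returns the stored chunks whose vectors lie closest to it. The toy sketch below ranks hand-written three-dimensional vectors by cosine distance. Both the choice of cosine distance and the vectors themselves are assumptions made for illustration; the source does not state which metric Tigris uses, and real embeddings have far more dimensions. The names `cosine_distance` and `top_k` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec, items, k=4):
    """Rank (text, vector) pairs by distance to query_vec, closest first."""
    scored = [(text, cosine_distance(query_vec, vec)) for text, vec in items]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Made-up vectors standing in for embedded document chunks.
toy_index = [
    ("about judges", [0.9, 0.1, 0.0]),
    ("about taxes", [0.0, 1.0, 0.0]),
    ("about courts", [0.7, 0.3, 0.1]),
]

for text, distance in top_k([1.0, 0.0, 0.0], toy_index, k=2):
    print(f"{text}: {distance:.3f}")
```

The default of `k=4` mirrors the number of documents LangChain's `similarity_search` returns when no `k` is given.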

Similarity search with score (vector distance)

query = "What did the president say about Ketanji Brown Jackson"
result = vector_store.similarity_search_with_score(query)
for doc, score in result:
    print(f"document={doc}, score={score}")
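
Since the score is described as a vector distance, a smaller value should mean a closer match, so one natural follow-up is to drop results that are too far from the query. This is a minimal sketch under that assumption; the helper `filter_by_distance`, the placeholder strings, and the scores are all invented for illustration, and a sensible cutoff depends on the embedding model and distance metric in use.

```python
def filter_by_distance(results, max_distance):
    """Keep only (doc, score) pairs whose distance is at most max_distance.

    Assumes lower scores mean closer matches, as the heading above suggests.
    """
    return [(doc, score) for doc, score in results if score <= max_distance]


# Placeholder pairs shaped like the output of similarity_search_with_score;
# the strings and numbers are made up, not real search results.
sample_results = [("chunk A", 0.12), ("chunk B", 0.48), ("chunk C", 0.31)]

close = filter_by_distance(sample_results, max_distance=0.35)
for doc, score in close:
    print(f"document={doc}, score={score}")
```

The same function can be applied directly to the `result` list from the cell above, since it only relies on each item being a `(doc, score)` pair.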