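Basic Milvus concepts (by analogy with relational databases such as MySQL):
Database: default
Table: collection
Table schema: CollectionSchema
One row of data: entity
To install Milvus, see the Docker installation steps at the end of this document.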
from pymilvus import (
    connections,
    utility,
    FieldSchema,
    CollectionSchema,
    DataType,
    Collection,
)
# "default" here is the connection alias. Milvus uses the database named "default" unless you create and select another one.
connections.connect("default", host="localhost", port="19530")
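To create a collection (table), first define its structure (the fields).
① The code below defines the fields and wraps them in a schema.
② It then uses Collection to create a collection named "hello_milvus".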
fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=8),
]
schema = CollectionSchema(fields, "hello_milvus is the simplest demo to introduce the APIs")
hello_milvus = Collection("hello_milvus", schema)
The entities below hold 3000 rows, organized by column: one list per field, in the same order as the schema above.
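If the schema above sets auto_id=True, omit the pk column, because Milvus then generates the primary keys itself.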
import random
entities = [
    [i for i in range(3000)],  # field pk
    [float(random.randrange(-20, -10)) for _ in range(3000)],  # field random
    [[random.random() for _ in range(8)] for _ in range(3000)],  # field embeddings
]
insert_result = hello_milvus.insert(entities)
hello_milvus.flush()
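The insert above passes data column by column, which is easy to get wrong when the columns end up with different lengths. The following is a minimal sketch in plain Python (no Milvus required) that checks the columns line up and shows the row view of the same data. The helper name columns_to_rows and the three sample rows are hypothetical and exist only for illustration.

```python
from typing import List, Sequence, Tuple


def columns_to_rows(columns: Sequence[Sequence]) -> List[Tuple]:
    """Turn column-oriented data (one list per field) into one tuple per entity."""
    lengths = {len(col) for col in columns}
    if len(lengths) != 1:
        raise ValueError(f"all columns must have the same length, got {sorted(lengths)}")
    return list(zip(*columns))


# Three sample entities in the same field order as the schema: pk, random, embeddings (dim=8).
columns = [
    [0, 1, 2],
    [-11.0, -15.0, -20.0],
    [[0.1 * i] * 8 for i in range(3)],
]
rows = columns_to_rows(columns)
print(len(rows))    # 3
print(rows[1][:2])  # (1, -15.0)
```

Each tuple in rows corresponds to one entity, which matches the "one row of data" analogy from the concept list.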
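Vector search is only fast when an index is used; otherwise Milvus falls back to brute-force search.
For guidance on choosing an index, see https://milvus.io/docs/index.md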
index = {
    "index_type": "IVF_FLAT",
    "metric_type": "L2",
    "params": {"nlist": 128},
}
hello_milvus.create_index("embeddings", index)
hello_milvus.load()
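For intuition about what nlist (index parameter) and nprobe (search parameter) control, here is a toy sketch of the inverted-file idea in plain Python. It is not how Milvus is implemented: the class ToyIVF is hypothetical, and its centroids are simply sampled from the data, whereas real IVF indexes train them with k-means. The sketch splits the vectors into nlist buckets and scans only the nprobe buckets closest to the query, compared against an exhaustive scan.

```python
import random
from typing import List

Vector = List[float]


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; it ranks neighbours the same way as L2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query: Vector, vectors: List[Vector], k: int) -> List[int]:
    """Exhaustive scan: score every stored vector and keep the k nearest ids."""
    ids = sorted(range(len(vectors)), key=lambda i: (squared_l2(query, vectors[i]), i))
    return ids[:k]


class ToyIVF:
    """Hypothetical, simplified inverted-file index (not Milvus code)."""

    def __init__(self, vectors: List[Vector], nlist: int, seed: int = 0) -> None:
        self.vectors = vectors
        # Simplification: centroids are sampled from the data instead of trained.
        self.centroids = random.Random(seed).sample(vectors, nlist)
        self.buckets: List[List[int]] = [[] for _ in range(nlist)]
        for i, vec in enumerate(vectors):
            nearest = min(range(nlist), key=lambda c: squared_l2(vec, self.centroids[c]))
            self.buckets[nearest].append(i)

    def search(self, query: Vector, k: int, nprobe: int) -> List[int]:
        # Rank buckets by centroid distance and scan only the nprobe closest ones.
        by_distance = sorted(
            range(len(self.centroids)),
            key=lambda c: squared_l2(query, self.centroids[c]),
        )
        candidates = [i for c in by_distance[:nprobe] for i in self.buckets[c]]
        candidates.sort(key=lambda i: (squared_l2(query, self.vectors[i]), i))
        return candidates[:k]


rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [rng.random() for _ in range(8)]

toy_index = ToyIVF(data, nlist=16)
exact = brute_force_search(query, data, k=3)
approx = toy_index.search(query, k=3, nprobe=2)

# Probing every bucket degenerates into the exhaustive scan.
print(toy_index.search(query, k=3, nprobe=16) == exact)  # True
print(len(set(approx) & set(exact)), "of 3 exact neighbours found with nprobe=2")
```

A larger nprobe scans more buckets, which raises recall and costs more time; a larger nlist makes each bucket smaller, so each probe is cheaper but covers less of the data.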
vectors_to_search = [[random.random() for _ in range(8)]]
search_params = {
    "metric_type": "L2",
    "params": {"nprobe": 10},
}
# Plain vector search: return the 3 nearest neighbours of the query vector.
result = hello_milvus.search(vectors_to_search, "embeddings", search_params, limit=3, output_fields=["random"])
# Same search restricted by a scalar filter; this call overwrites the previous result.
result = hello_milvus.search(vectors_to_search, "embeddings", search_params, limit=3, expr="random > -12", output_fields=["random"])
# result holds one list of hits per query vector.
for hits in result:
    for hit in hits:
        print(f"hit: {hit}, random field: {hit.entity.get('random')}")
Sample output (values vary between runs because the data is random):
hit: (distance: 0.07080483436584473, id: 2060), random field: -12.0
hit: (distance: 0.0729561522603035, id: 1647), random field: 15.0
hit: (distance: 0.08006095886230469, id: 2530), random field: -15.0
search latency = 0.0080s
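To make the effect of the expr argument concrete, the sketch below reproduces the semantics of a filtered search in plain Python: only rows that satisfy the condition are candidates, and the nearest limit of them are returned. The function filtered_search and the in-memory rows are hypothetical stand-ins for illustration; Milvus evaluates the expression itself on the server.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance; it ranks neighbours the same way as L2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(
    query: List[float],
    rows: List[Row],
    limit: int,
    predicate: Callable[[Row], bool],
) -> List[Tuple[float, int, float]]:
    """Return (distance, pk, random) for the `limit` nearest rows passing the predicate."""
    candidates = [row for row in rows if predicate(row)]
    scored = sorted(
        (squared_l2(query, row["embeddings"]), row["pk"], row["random"])
        for row in candidates
    )
    return scored[:limit]


rng = random.Random(7)
rows = [
    {
        "pk": i,
        "random": float(rng.randrange(-20, -10)),
        "embeddings": [rng.random() for _ in range(8)],
    }
    for i in range(3000)
]
query = [rng.random() for _ in range(8)]

# Equivalent in spirit to expr="random > -12" with limit=3.
hits = filtered_search(query, rows, limit=3, predicate=lambda row: row["random"] > -12)
for distance, pk, value in hits:
    print(f"hit: (distance: {distance}, id: {pk}), random field: {value}")
```

Because randrange(-20, -10) only produces the integers -20 through -11, the condition random > -12 leaves just the rows whose random field is -11.0.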
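To delete data from the collection, specify a delete condition.
The code below deletes the first and second entities by primary key.
expr = f"pk in [{entities[0][0]}, {entities[0][1]}]"
hello_milvus.delete(expr)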
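A delete condition of this kind is just a string that lists individual primary-key values. The small helper below is a sketch for building such a string from any list of integer keys; the name pk_in_expr is hypothetical, and the default field name "pk" is taken from the schema in this document.

```python
from typing import Iterable


def pk_in_expr(primary_keys: Iterable[int], field: str = "pk") -> str:
    """Build an expression such as 'pk in [0, 1]' from integer primary keys."""
    keys = [int(k) for k in primary_keys]
    if not keys:
        raise ValueError("at least one primary key is required")
    return f"{field} in [{', '.join(str(k) for k in keys)}]"


print(pk_in_expr([0, 1]))  # pk in [0, 1]
```

The resulting string can be passed to hello_milvus.delete(...). The keys could come, for example, from the first column of entities or from insert_result.primary_keys.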
To drop the collection itself from the database, use the following command:
utility.drop_collection("hello_milvus")
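Installing Milvus in standalone mode with Docker
# Download the docker-compose file for Milvus v2.2.11
wget https://github.com/milvus-io/milvus/releases/download/v2.2.11/milvus-standalone-docker-compose.yml -O docker-compose.yml
# Start the containers in the background
sudo docker-compose up -d
# Check that the containers are running
sudo docker-compose ps
# Show which host port is mapped to Milvus port 19530
docker port milvus-standalone 19530/tcp
# Stop and remove the containers when you are done
sudo docker-compose down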
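While the containers are running (that is, before docker-compose down), a quick way to confirm that Milvus is reachable from Python is to test whether the port accepts TCP connections. This sketch uses only the standard library; the function name port_is_open is hypothetical, and the host and port defaults are the ones used by connections.connect earlier in this document. An open port only shows that something is listening, not that Milvus has finished starting.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 19530, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


print(port_is_open())
```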