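Milvus is an open-source similarity search engine for massive-scale feature vectors. It is designed on a heterogeneous many-core computing framework, which lowers cost and improves performance. Even with limited computing resources, a search over one billion vectors returns in milliseconds.
It is commonly used in scenarios such as image search, intelligent question answering, and product-to-product search.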
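Milvus is deployed with Docker, so your Docker installation must meet the following requirements:
- Docker version 19.03 or later (Deploying Docker)
- Docker Compose version 1.25.1 or later (Installing Compose)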
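The version requirements above can be checked by hand, but a small script makes the comparison less error-prone. The following is a minimal sketch that treats the listed versions as minimums (an assumption) and compares them against the text printed by `docker --version` and `docker-compose --version`. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of Docker or Milvus.

```python
import re


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_output, minimum):
    """Compare a tool's version output against a minimum version string."""
    return parse_version(version_output) >= parse_version(minimum)


# Sample strings in the shape printed by the two tools (values are illustrative).
docker_output = "Docker version 20.10.7, build f0df350"
compose_output = "docker-compose version 1.29.2, build 5becea4c"

print("Docker OK:", meets_minimum(docker_output, "19.03"))
print("Compose OK:", meets_minimum(compose_output, "1.25.1"))
```

To check a real machine, pass the output of `subprocess.run(["docker", "--version"], capture_output=True, text=True).stdout` instead of the sample strings.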
The CPU must support at least one of the following instruction sets: SSE4.2, AVX, AVX2, AVX-512
# Command to check
lscpu | grep -e sse4_2 -e avx -e avx2 -e avx512
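The same check can be done from Python by reading the CPU flags directly. This is a minimal sketch, assuming a Linux x86 host where `/proc/cpuinfo` contains a `flags` line; the function names are hypothetical. Unlike the `grep` above, it matches whole flag names, and it treats any flag beginning with `avx512` (such as `avx512f`) as AVX-512 support.

```python
REQUIRED_ANY = ("sse4_2", "avx", "avx2", "avx512")


def supported_simd(cpuinfo_text):
    """Return the required instruction sets that appear in the CPU flags."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())

    found = []
    for name in REQUIRED_ANY:
        if name == "avx512":
            # AVX-512 shows up as a family of flags: avx512f, avx512bw, ...
            present = any(flag.startswith("avx512") for flag in flags)
        else:
            present = name in flags
        if present:
            found.append(name)
    return found


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the CPU description exposed by the Linux kernel."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Illustrative input; on a real host use supported_simd(read_cpuinfo()).
sample = "processor\t: 0\nflags\t\t: fpu sse2 sse4_2 avx\n"
print(supported_simd(sample) or "no supported instruction set found")
```

An empty result means none of the four instruction sets is visible to the operating system, which is worth checking inside a virtual machine as well as on the host.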
wget https://raw.githubusercontent.com/milvus-io/milvus/master/deployments/docker/standalone/docker-compose.yml -O docker-compose.yml
docker-compose up -d
Once log output like the following appears, the installation is complete:
Status: Downloaded newer image for milvusdb/milvus:v2.0.0-rc2-20210712-a8e5fd2
Creating milvus-etcd ... done
Creating milvus-minio ... done
Creating milvus-standalone ... done
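PS: In the environment used for this article, startup failed, and the problem was not resolved inside the virtual machine.
Cause: the CPU must support the extended instruction sets listed earlier. The host PC supports them, but the virtual machine created with Oracle VM VirtualBox does not, so Milvus fails to start.
Workaround: install Docker directly on the PC and start Milvus there.
Use the following command to confirm that the standalone installation succeeded: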
[root@slave2 docker]# sudo docker-compose ps
Name                Command                          State                   Ports
------------------------------------------------------------------------------------------------------------------------
milvus-etcd         etcd -listen-peer-urls=htt ...   Up (health: starting)   2379/tcp, 2380/tcp
milvus-minio        /usr/bin/docker-entrypoint ...   Up (health: starting)   9000/tcp
milvus-standalone   /tini -- milvus run standalone   Up                      0.0.0.0:19530->19530/tcp,:::19530->19530/tcp
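A container listed as `Up` may still need a moment before it accepts connections. The sketch below polls the published port 19530 from the output above until a TCP connection succeeds. `wait_for_port` is a hypothetical helper, and an open port only shows that something is listening, not that Milvus has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 19530, timeout=120)` can be called before running any client script, with the host adjusted if Docker runs on another machine.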
Stop Milvus
sudo docker-compose down
Start Milvus
docker-compose up -d
Official documentation
Important: if your Python version is 2.7, upgrade OpenSSL first, before upgrading to Python 3.6+
yum -y install openssl openssl-devel
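Note: Python 3.6+ is required (3.6 is recommended to avoid errors). Upgrade reference
Prerequisites for Python 3.6+
# 1. First, make sure the bzip2 development package is installed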
yum install bzip2-devel
# 2. Next, make sure OpenSSL is up to date
yum -y install openssl openssl-devel
# 3. Install GCC if it is not already installed
yum -y install gcc
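The bzip2 and OpenSSL development packages matter because a Python interpreter built from source without them ends up lacking its `bz2` and `ssl` modules, and `pip` needs `ssl` to download packages over HTTPS. A minimal sketch for checking an upgraded interpreter follows; the helper names are hypothetical.

```python
import importlib
import sys


def missing_modules(names=("ssl", "bz2")):
    """Return the standard-library modules that fail to import in this build."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def python_ok(minimum=(3, 6)):
    """Check that the running interpreter is at least the given version."""
    return sys.version_info[:2] >= minimum


print("Python version OK:", python_ok())
print("Missing modules:", missing_modules() or "none")
```

If a module is reported missing, install the matching development package and rebuild Python.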
Install the dependency with pip
pip3 install pymilvus-orm==2.0.0rc1
Download the test script
wget https://raw.githubusercontent.com/milvus-io/pymilvus-orm/v2.0.0rc1/examples/hello_milvus.py
Script code
# import package
import random
import time

from pymilvus_orm import *


def hello_milvus():
    # create connection
    connections.connect()

    print("\nList collections...")
    print(list_collections())

    # create collection
    dim = 128
    default_fields = [
        schema.FieldSchema(name="count", dtype=DataType.INT64, is_primary=True),
        schema.FieldSchema(name="score", dtype=DataType.DOUBLE),
        schema.FieldSchema(name="float_vector", dtype=DataType.FLOAT_VECTOR, dim=dim)
    ]
    default_schema = schema.CollectionSchema(fields=default_fields, description="test collection")

    print("\nCreate collection...")
    collection = Collection(name="hello_milvus", data=None, schema=default_schema)

    print("\nList collections...")
    print(list_collections())

    # insert data
    nb = 3000
    vectors = [[random.random() for _ in range(dim)] for _ in range(nb)]
    collection.insert([[i for i in range(nb)], [float(i) for i in range(nb)], vectors])

    print("\nGet collection entities...")
    print(collection.num_entities)

    # create index and load table
    default_index = {"index_type": "IVF_FLAT", "params": {"nlist": 128}, "metric_type": "L2"}
    print("\nCreate index...")
    collection.create_index(field_name="float_vector", index_params=default_index)
    print("\nload collection...")
    collection.load()

    # load and search
    topK = 5
    search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
    start_time = time.time()
    print("\nSearch...")
    res = collection.search(vectors[-2:], "float_vector", search_params, topK, "count > 100")
    end_time = time.time()

    # show result
    for hits in res:
        for hit in hits:
            print(hit)
    print("search latency = %.4fs" % (end_time - start_time))

    # drop collection
    collection.drop()


hello_milvus()
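To see what the search step in the script is asking for, it can help to compute the same kind of result by brute force. The sketch below is not how Milvus works internally: the script uses an IVF_FLAT index, which searches approximately, while this code scans every vector. It ranks by squared Euclidean distance (same ordering as L2 distance) and applies a filter comparable to `count > 100`. The function names and the smaller data sizes are assumptions chosen for illustration.

```python
import heapq
import random


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(ids, vectors, query, top_k, keep=lambda entity_id: True):
    """Return the top_k (distance, id) pairs closest to query among kept ids."""
    candidates = (
        (l2_squared(vector, query), entity_id)
        for entity_id, vector in zip(ids, vectors)
        if keep(entity_id)
    )
    return heapq.nsmallest(top_k, candidates)


# Smaller than the script's 3000 x 128 data set, to keep the demo quick.
dim, nb = 8, 1000
ids = list(range(nb))
vectors = [[random.random() for _ in range(dim)] for _ in range(nb)]

# The query vector is itself in the data set, so the first hit has distance 0.
hits = brute_force_search(ids, vectors, vectors[-1], top_k=5, keep=lambda i: i > 100)
for distance, entity_id in hits:
    print(f"id={entity_id} distance={distance:.4f}")
```

An index such as IVF_FLAT trades a little of this exactness for speed by scanning only some clusters, which is what the `nlist` and `nprobe` parameters in the script control.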
Run the script
# Don't forget to start Milvus before running it
python3 hello_milvus.py
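If you need the Java SDK, install version 1.1.0,
because at the time of writing the Java SDK only goes up to 1.1.0 and does not support Milvus 2.0.
Java SDK official site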