Deploying the local models bge-large-zh-v1.5 and bge-reranker-v2-m3 with Xinference on Ubuntu

bge-large-zh-v1.5

Download the model to a local directory:

modelscope download --model BAAI/bge-large-zh-v1.5 --local_dir ./bge-large-zh-v1.5

Define the custom embedding model in custom-bge-large-zh-v1.5.json (model_uri must point to the directory the model was downloaded to):

{
    "model_name": "custom-bge-large-zh-v1.5",
    "dimensions": 1024,
    "max_tokens": 512,
    "language": ["zh"],
    "model_id": "BAAI/bge-large-zh-v1.5",
    "model_uri": "/path/to/bge-large-zh-v1.5"
}
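The JSON above uses the placeholder /path/to/bge-large-zh-v1.5, while the download step used a relative directory. As a minimal sketch (not part of the original post), the spec can be generated with the path resolved to an absolute one. The helper name build_embedding_spec is hypothetical; the field values are copied from the JSON above.

```python
import json
from pathlib import Path


def build_embedding_spec(model_dir: str) -> dict:
    """Return a custom embedding spec with the same fields as the JSON above."""
    return {
        "model_name": "custom-bge-large-zh-v1.5",
        "dimensions": 1024,
        "max_tokens": 512,
        "language": ["zh"],
        "model_id": "BAAI/bge-large-zh-v1.5",
        # resolve() turns a relative directory such as ./bge-large-zh-v1.5
        # into an absolute path, so the spec does not depend on the
        # directory the server was started from.
        "model_uri": str(Path(model_dir).resolve()),
    }


def spec_to_json(spec: dict) -> str:
    """Serialize the spec in the same 4-space layout as the file above."""
    return json.dumps(spec, indent=4, ensure_ascii=False)
```

Writing spec_to_json(build_embedding_spec("./bge-large-zh-v1.5")) to custom-bge-large-zh-v1.5.json gives a file that can be passed to the register command.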

Register the custom model:

xinference register --model-type embedding --file custom-bge-large-zh-v1.5.json --persist

Launch the custom model:

xinference launch --model-name custom-bge-large-zh-v1.5 --model-type embedding
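Once the model is running, it can be queried through Xinference's OpenAI-compatible embeddings route. The following is a sketch under assumptions, not the author's code: the endpoint argument is whatever address your Xinference server listens on (http://localhost:9999 in this post), and the function names are made up for illustration.

```python
import json
import urllib.request


def build_embedding_request(endpoint: str, model: str, texts: list) -> urllib.request.Request:
    """Build a POST request for the /v1/embeddings route."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(endpoint: str, model: str, texts: list) -> list:
    """Send the request and return one vector per input text.

    Needs a running Xinference server with the model launched.
    """
    req = build_embedding_request(endpoint, model, texts)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```

Calling embed("http://localhost:9999", "custom-bge-large-zh-v1.5", ["你好"]) should return a single vector; with the spec above its length is expected to match the declared dimensions of 1024.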

bge-reranker-v2-m3

Download the model to a local directory:

modelscope download --model AI-ModelScope/bge-reranker-v2-m3 --local_dir ./bge-reranker-v2-m3

Define the custom rerank model in custom-bge-reranker-v2-m3.json:

{
    "model_name": "custom-bge-reranker-v2-m3",
    "type": "normal",
    "language": ["en", "zh", "multilingual"],
    "model_id": "BAAI/bge-reranker-v2-m3",
    "model_uri": "/path/to/bge-reranker-v2-m3"
}

Register the custom model:

xinference register --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist

This fails with the following error:

Traceback (most recent call last):
  File "//env/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "//env/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "//env/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "//env/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 407, in register_model
    client.register_model(
  File "//env/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1188, in register_model
    raise RuntimeError(
RuntimeError: Failed to register model, detail: Not Found
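The "Not Found" detail suggests the request reached an HTTP server that does not serve the Xinference registration route, which is what happens when the CLI targets the wrong address. To my understanding, when --endpoint is omitted the CLI falls back to the XINFERENCE_ENDPOINT environment variable and otherwise to http://127.0.0.1:9997. The sketch below mirrors that lookup order for illustration only; resolve_endpoint is a hypothetical helper, not a function from the xinference package, and the default value is an assumption about the installed version.

```python
import os

# Assumed default address used when nothing else is configured.
DEFAULT_ENDPOINT = "http://127.0.0.1:9997"


def resolve_endpoint(cli_endpoint=None, env=None) -> str:
    """Return the address a client would target.

    Order: explicit --endpoint value, then XINFERENCE_ENDPOINT,
    then the assumed default.
    """
    env = os.environ if env is None else env
    if cli_endpoint:
        return cli_endpoint
    return env.get("XINFERENCE_ENDPOINT", DEFAULT_ENDPOINT)
```

With a server started on port 9999 and no endpoint configured, this lookup ends at port 9997, so the registration request never reaches the running instance.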

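The command succeeds once the endpoint is passed explicitly, because this Xinference instance is listening on port 9999 rather than the default 9997: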

xinference register --endpoint http://localhost:9999 --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist

Launch the custom model:

xinference launch --model-type rerank --model-name custom-bge-reranker-v2-m3 --endpoint http://localhost:9999
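With the reranker launched, documents can be scored against a query through the /v1/rerank route. This is a hedged sketch rather than the author's code: the function names are hypothetical, and the response shape (a "results" list whose items carry "index" and "relevance_score") is what I expect from Xinference's rerank API, so check it against your installed version.

```python
import json
import urllib.request


def build_rerank_request(endpoint: str, model: str, query: str, documents: list) -> urllib.request.Request:
    """Build a POST request for the /v1/rerank route."""
    payload = {"model": model, "query": query, "documents": documents}
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def order_by_relevance(response_body: dict, documents: list) -> list:
    """Map results back to the input documents, highest score first."""
    results = sorted(
        response_body["results"],
        key=lambda r: r["relevance_score"],
        reverse=True,
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


def rerank(endpoint: str, model: str, query: str, documents: list) -> list:
    """Send the request; needs a running server with the model launched."""
    req = build_rerank_request(endpoint, model, query, documents)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return order_by_relevance(json.load(resp), documents)
```

For example, rerank("http://localhost:9999", "custom-bge-reranker-v2-m3", query, documents) returns (document, score) pairs sorted from most to least relevant.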

Verify that both models loaded successfully; the response lists every model that is currently running.

curl http://localhost:9999/v1/models
{"object":"list","data":[{"id":"custom-bge-large-zh-v1.5","object":"model","created":0,"owned_by":"xinference","model_type":"embedding","address":"0.0.0.0:39987","accelerators":[],"model_name":"custom-bge-large-zh-v1.5","dimensions":1024,"max_tokens":512,"language":["zh"],"model_revision":null,"replica":1},{"id":"custom-bge-reranker-v2-m3","object":"model","created":0,"owned_by":"xinference","model_type":"rerank","address":"0.0.0.0:44611","accelerators":[],"type":"normal","model_name":"custom-bge-reranker-v2-m3","language":["en","zh","multilingual"],"model_revision":null,"replica":1}]}
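Reading that single-line JSON by eye is error-prone, so the check can be scripted. The sketch below works on the already-parsed /v1/models response; the function names are hypothetical, and only the "data", "id" and "model_type" fields visible in the output above are used.

```python
def loaded_models(models_response: dict) -> dict:
    """Map each running model id to its model_type."""
    return {m["id"]: m.get("model_type") for m in models_response.get("data", [])}


def missing_models(models_response: dict, expected_ids: list) -> list:
    """Return the expected model ids that are not in the response."""
    running = loaded_models(models_response)
    return [model_id for model_id in expected_ids if model_id not in running]
```

Passing the parsed output of curl http://localhost:9999/v1/models together with the two custom model names should give an empty list when both models are up.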
