The TF Serving architecture consists of servables, loaders, sources, managers, and the core; see the official TensorFlow Serving documentation for details.
This article focuses on deploying models with Docker; the official tutorial can be found here.
docker pull tensorflow/serving:1.13.0
docker image ls | grep tensorflow
The output looks roughly like this (I have two tensorflow/serving images):
tensorflow/serving 1.13.0 38bee21b2ca0 3 months ago 229MB
tensorflow/serving latest 38bee21b2ca0 3 months ago 229MB
Suppose I have a model test1 in the host directory /tmp/models/test1, which is mapped to /models/test1 inside the Docker container. The file tree:
test1
└── 1
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
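Each numeric subdirectory under test1 is one exported version of the model in SavedModel format. As a minimal sketch of how such a directory can be produced (assuming TF 1.x; the toy graph and the tensor names x, w, and y are hypothetical), version 1 could be exported with tf.saved_model.simple_save:

import tensorflow as tf

# Toy graph for illustration only; replace with your real model.
x = tf.placeholder(tf.float32, shape=[None, 3], name="x")
w = tf.Variable(tf.ones([3, 1]), name="w")
y = tf.matmul(x, w, name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writes saved_model.pb plus the variables/ directory
    # under /tmp/models/test1/1.
    tf.saved_model.simple_save(
        sess, "/tmp/models/test1/1", inputs={"x": x}, outputs={"y": y})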
Load the latest model directly (not recommended: only the highest-numbered version gets loaded):
docker run -t --rm -p 8500:8500 -p 8501:8501 \
-v /tmp/models/test1:/models/test1 -e MODEL_NAME=test1 tensorflow/serving &
or
docker run -t --rm -p 8500:8500 -p 8501:8501 \
--mount type=bind,source=/tmp/models/test1,target=/models/test1 \
-e MODEL_NAME=test1 tensorflow/serving &
Port 8500 serves gRPC traffic and port 8501 serves the REST API; the MODEL_NAME environment variable is the model name used when talking to the server over either protocol.
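As a quick smoke test, a prediction can be requested over the REST port with a few lines of Python. This is only a sketch: the "instances" payload below is a placeholder and must be shaped to your model's actual serving signature:

import json
import requests

# Predict against the latest version of model "test1" over REST.
payload = json.dumps({"instances": [[1.0, 2.0, 3.0]]})
resp = requests.post("http://localhost:8501/v1/models/test1:predict",
                     data=payload)
print(resp.json())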
Now suppose the model test1 in the host directory /tmp/models/test1 (again mapped to /models/test1 in the container) contains two versions, 1 and 2. The file tree:
test1
├── 1
│   ├── saved_model.pb
│   └── variables
│       ├── variables.data-00000-of-00001
│       └── variables.index
└── 2
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
This approach serves multiple versions of a model simultaneously, and running models can be updated via hot reloading.
Create a config file /tmp/models/test1/test1_model.config with the following content:
model_config_list {
  config {
    name: "test1"
    base_path: "/models/test1"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
  }
}
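Besides the specific policy used above, model_version_policy in the ServableVersionPolicy proto also supports all (serve every version found on disk) and latest (serve only the newest N versions). For example, to always serve the two most recent versions instead of pinning them, the stanza should look like:

model_version_policy {
  latest {
    num_versions: 2
  }
}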
Parameter notes: name plays the same role as MODEL_NAME in the first approach; base_path is the model's path inside the tensorflow/serving container; model_version_policy specifies which versions to load, here versions 1 and 2. Start the container, pointing it at the config file:
docker run -t --rm -p 8500:8500 -p 8501:8501 \
-v /tmp/models/test1:/models/test1 tensorflow/serving --model_config_file=/models/test1/test1_model.config
2019-05-31 05:49:55.601881: I tensorflow_serving/model_servers/server_core.cc:558] (Re-)adding model: test1
2019-05-31 05:49:55.703037: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: test1 version: 2}
2019-05-31 05:49:55.703270: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: test1 version: 2}
2019-05-31 05:49:55.703325: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: test1 version: 2}
2019-05-31 05:49:55.703386: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/test1/2
2019-05-31 05:49:55.703473: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/test1/2
2019-05-31 05:49:55.708316: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-05-31 05:49:55.717364: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-31 05:49:55.746743: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:182] Restoring SavedModel bundle.
2019-05-31 05:49:55.767047: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:285] SavedModel load for tags { serve }; Status: success. Took 63566 microseconds.
2019-05-31 05:49:55.767141: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:101] No warmup data file found at /models/test1/2/assets.extra/tf_serving_warmup_requests
2019-05-31 05:49:55.767513: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: test1 version: 2}
2019-05-31 05:49:55.769139: I tensorflow_serving/model_servers/server.cc:313] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
[evhttp_server.cc : 237] RAW: Entering the event loop ...
2019-05-31 05:49:55.770267: I tensorflow_serving/model_servers/server.cc:333] Exporting HTTP/REST API at:localhost:8501 ...
2019-05-31 05:49:55.802771: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: test1 version: 1}
2019-05-31 05:49:55.802865: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: test1 version: 1}
2019-05-31 05:49:55.802881: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: test1 version: 1}
2019-05-31 05:49:55.802900: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/test1/1
2019-05-31 05:49:55.802915: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/test1/1
2019-05-31 05:49:55.804155: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-05-31 05:49:55.807801: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:182] Restoring SavedModel bundle.
2019-05-31 05:49:55.823720: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:285] SavedModel load for tags { serve }; Status: success. Took 20777 microseconds.
2019-05-31 05:49:55.823806: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:101] No warmup data file found at /models/test1/1/assets.extra/tf_serving_warmup_requests
2019-05-31 05:49:55.823962: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: test1 version: 1}
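The log shows both versions being loaded. This can also be confirmed from outside the container through the REST model status endpoint, for example:

import requests

# Lists the state (e.g. AVAILABLE) of every loaded version of "test1".
print(requests.get("http://localhost:8501/v1/models/test1").json())
# A specific version can also be targeted for inference, e.g.
# POST http://localhost:8501/v1/models/test1/versions/1:predict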
A version can additionally be given a human-readable label by adding a version_labels entry to the config block, e.g. to mark version 1 as "stable":
version_labels {
  key: "stable"
  value: 1
}
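A label can then stand in for a version number when calling the server over gRPC, via the version_label field of ModelSpec. A minimal sketch (the input tensor name "x" is hypothetical and must match your model's signature):

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "test1"
request.model_spec.version_label = "stable"  # resolves to version 1 here
request.inputs["x"].CopyFrom(tf.make_tensor_proto([[1.0, 2.0, 3.0]]))
print(stub.Predict(request, 10.0))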
Reconfiguring a running server can only be done via hot reloading. The following updates the running server's model_config_file over gRPC:
import grpc
from google.protobuf import text_format
from tensorflow_serving.apis import model_management_pb2
from tensorflow_serving.apis import model_service_pb2_grpc
from tensorflow_serving.config import model_server_config_pb2


def update(config_file="model.config", host_port="localhost:8500"):
    channel = grpc.insecure_channel(host_port)
    stub = model_service_pb2_grpc.ModelServiceStub(channel)
    request = model_management_pb2.ReloadConfigRequest()

    # Read and parse the existing config file.
    config_content = open(config_file, "r").read()
    model_server_config = model_server_config_pb2.ModelServerConfig()
    model_server_config = text_format.Parse(text=config_content,
                                            message=model_server_config)
    model_configs = model_server_config.model_config_list.config

    # Modify the existing config entries: also load versions 3 and 4.
    for config in model_configs:
        config.model_version_policy.specific.versions.extend([3, 4])
        # Assign version_labels key-value pairs.
        config.version_labels["stable"] = 3
        config.version_labels["canary"] = 4

    # Create a brand-new config entry for a second model.
    config_list = model_server_config_pb2.ModelConfigList()
    new_config = config_list.config.add()
    new_config.name = "test2"
    new_config.base_path = "/models/test2"
    new_config.model_platform = "tensorflow"
    # Assign the model_version_policy field.
    new_config.model_version_policy.specific.versions.extend([3, 4])
    # Merge it into the original config message.
    model_server_config.model_config_list.MergeFrom(config_list)

    request.config.CopyFrom(model_server_config)
    response = stub.HandleReloadConfigRequest(request, 10)
    if response.status.error_code == 0:
        # Persist the new config so the file matches the running server.
        open(config_file, "w").write(str(request.config))
        print("TF Serving config file updated.")
    else:
        print("Failed to update config file.")
        print(response.status.error_code)
        print(response.status.error_message)
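Calling the function then reloads the running server in place; note that versions 3 and 4 must already exist on disk under the corresponding base_path directories, otherwise the reload fails:

if __name__ == "__main__":
    update("model.config", "localhost:8500")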
References:
https://blog.csdn.net/hahajinbu/article/details/81945149
https://stackoverflow.com/questions/54440762/tensorflow-serving-update-model-config-add-additional-models-at-runtime