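Deploying a deep learning model with Docker and tensorflow/serving, following https://blog.csdn.net/qq_35565669/article/details/106903787
The software environment used here is Windows 10 + Anaconda3 + TensorFlow 1.13.1 + Keras 2.3.1.
When the model simply takes a numpy array and returns the computed result, the request can be sent directly:
query_data = '{"instances": [[1.0, 2.0, 3.0]]}'
requests.post(url, data=query_data)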
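For the plain-array case, the request body can also be built with json.dumps rather than written by hand, which rules out unbalanced brackets. The following is a minimal sketch, not the original author's code: build_predict_body and post_predict are hypothetical helper names, and the REST call assumes that the container's REST port 8501 has been published in addition to the gRPC port.

```python
import json


def build_predict_body(instances):
    """Serialize a batch of inputs into the {"instances": ...} JSON request body."""
    return json.dumps({"instances": instances})


def post_predict(url, instances, timeout=5.0):
    """POST the batch to a TensorFlow Serving REST endpoint and return its predictions."""
    import requests  # imported lazily so build_predict_body works without requests installed

    response = requests.post(url, data=build_predict_body(instances), timeout=timeout)
    response.raise_for_status()
    return response.json()["predictions"]


# Building the body does not need a running server.
body = build_predict_body([[1.0, 2.0, 3.0]])
```

A call would look like post_predict("http://127.0.0.1:8501/v1/models/my_model:predict", [[1.0, 2.0, 3.0]]), where my_model is a placeholder for the served model name.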
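This does not work when the input is image data, however, because the model's predict interface takes float parameters rather than strings. To check the model's input and output parameters, use the saved_model_cli command as follows.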
(gpu) E:\tmp\tfserving\ur_seg\00000123>saved_model_cli show --dir . --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, -1, 3)
        name: data:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['softmax/truediv:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, -1, 5)
        name: softmax/truediv:0
  Method name is: tensorflow/serving/predict
The output shows that the model's signature is 'serving_default'. The input is named 'input' and takes a tensor of dtype DT_FLOAT; the output is named 'softmax/truediv:0' and is also a DT_FLOAT tensor.
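The -1 entries in the reported shape mark dimensions that are left unspecified, so any batch size, height, and width are accepted as long as the last axis has 3 channels. A small sketch of that rule follows; shape_matches is a hypothetical helper written for illustration and is not part of TensorFlow.

```python
def shape_matches(signature_shape, actual_shape):
    """Return True if actual_shape fits signature_shape, where -1 accepts any size."""
    if len(signature_shape) != len(actual_shape):
        return False
    return all(s == -1 or s == a for s, a in zip(signature_shape, actual_shape))


INPUT_SHAPE = (-1, -1, -1, 3)  # copied from the saved_model_cli output

ok_batched = shape_matches(INPUT_SHAPE, (1, 480, 640, 3))  # one RGB image with a batch axis
ok_unbatched = shape_matches(INPUT_SHAPE, (480, 640, 3))   # batch axis missing
ok_gray = shape_matches(INPUT_SHAPE, (1, 480, 640, 1))     # wrong channel count
```

This is why an image read as (h, w, 3) needs a leading batch axis before it is sent to the model.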
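Start the container. The gRPC port inside the image is 8500. The command below bind-mounts the local model directory given in source onto the container path given in target, then starts the tfserving container.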
docker run -p 8500:8500 --name="ur_seg" --mount type=bind,source=E:/tmp/tfserving/ur_seg,target=/models/ur_seg -e MODEL_NAME=ur_seg -t tensorflow/serving "&"
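To make the role of each flag explicit, the same command can be assembled as an argument list in Python. This is only a sketch under stated assumptions: docker_run_args is a hypothetical helper, and it reproduces the options used in this post (port mapping, container name, bind mount, MODEL_NAME) without the trailing "&".

```python
def docker_run_args(model_name, host_model_dir, grpc_port=8500, image="tensorflow/serving"):
    """Build the argument list for starting a tensorflow/serving container."""
    return [
        "docker", "run",
        # publish the container's gRPC port 8500 on the chosen host port
        "-p", "{}:8500".format(grpc_port),
        "--name", model_name,
        # bind-mount the local model directory to /models/<model_name> in the container
        "--mount", "type=bind,source={},target=/models/{}".format(host_model_dir, model_name),
        # tell the server which model under /models to load
        "-e", "MODEL_NAME={}".format(model_name),
        "-t", image,
    ]


args = docker_run_args("ur_seg", "E:/tmp/tfserving/ur_seg")
```

The list can be passed to subprocess.run(args) to start the container from a script.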
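With the server running, the whole image has to be sent as input. This goes through the gRPC service, so first install tensorflow-serving-api. Its version must match the TensorFlow version, otherwise compatibility problems will appear, and it must be installed with pip.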
pip install tensorflow-serving-api==1.13.1
The code for requesting the service is as follows.
import tensorflow as tf
import functions  # local helper module (provides prediction_to_image, not shown in this post)
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import grpc
import numpy as np
import cv2 as cv


def request_server(img, server_url):
    """
    Request an inference result from a TensorFlow Serving instance.
    :param img: preprocessed image array to run inference on, numpy array, shape: (h, w, 3)
    :param server_url: TensorFlow Serving address and port, str, e.g. '127.0.0.1:8500'
    :return: result array returned by the model, numpy array
    """
    channel = grpc.insecure_channel(server_url)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "ur_seg"  # model name inside the container
    request.model_spec.signature_name = "serving_default"
    # "input" is the name of the model's input.
    # dtype must be float32 to match the DT_FLOAT input of the signature;
    # the leading 1 adds the batch axis.
    request.inputs["input"].CopyFrom(
        tf.compat.v1.make_tensor_proto(img, shape=[1, ] + list(img.shape), dtype=tf.float32))
    response = stub.Predict(request, 5.0)  # 5.0 is the timeout in seconds
    print(response)
    # float_val is a flat list, so the returned array is one-dimensional
    return np.asarray(response.outputs["softmax/truediv:0"].float_val)


if __name__ == '__main__':
    img_path = r'E:\imgs\2.png'
    img = cv.imread(img_path)
    img = img.astype(np.float32)  # np.float is no longer available in recent NumPy releases
    url = r'127.0.0.1:8500'
    pred = request_server(img, url)
    preds = functions.prediction_to_image(pred)
    cv.imshow('', preds)
    cv.waitKey(0)
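Because request_server returns float_val as a flat array, the caller has to restore the (h, w, 5) layout before reading per-pixel class scores. The sketch below shows that step under several assumptions: a single image in the batch, 5 output classes as reported by saved_model_cli, and an output with the same height and width as the input. flat_to_class_map is a hypothetical helper and is not the prediction_to_image function from the functions module.

```python
import numpy as np


def flat_to_class_map(flat_scores, height, width, num_classes=5):
    """Reshape flat softmax scores to (height, width, num_classes) and pick the best class per pixel."""
    scores = np.asarray(flat_scores, dtype=np.float32).reshape(height, width, num_classes)
    return scores.argmax(axis=-1)


# Toy input: a 1x2 image with 5 class scores per pixel.
toy = [0.1, 0.6, 0.1, 0.1, 0.1,
       0.0, 0.0, 0.0, 0.2, 0.8]
class_map = flat_to_class_map(toy, height=1, width=2)
```

An alternative that avoids passing the sizes by hand is tf.make_ndarray(response.outputs["softmax/truediv:0"]), which converts the returned TensorProto into an array using the shape stored in the proto.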
The reference link always gives the container address as 0.0.0.0:8500, but that address turned out to be unavailable:
File "E:/Brian/videoProcess/us_analysis/code/deploy.py", line 63, in request_server
response = stub.Predict(request, 10.0)
File "C:\Users\csai\Anaconda2\envs\gpu\lib\site-packages\grpc\_channel.py", line 826, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "C:\Users\csai\Anaconda2\envs\gpu\lib\site-packages\grpc\_channel.py", line 729, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1593227968.373000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3941,"referenced_errors":[{"created":"@1593227968.373000000","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":393,"grpc_status":14}]}"
I looked up a lot of information without finding a fix. Only after switching to 127.0.0.1:8500 did I get the correct result (a real pitfall!).
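When a gRPC call ends with StatusCode.UNAVAILABLE, a plain TCP check can quickly tell whether the address is reachable at all before any request is built. This is a minimal sketch using only the standard library; can_connect is a hypothetical helper, not part of grpc.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and addresses that cannot be reached
        return False
```

Running can_connect("127.0.0.1", 8500) and can_connect("0.0.0.0", 8500) against the started container shows directly which of the two addresses the client is able to reach.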
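I took a look at the hosts file.
On a server, 0.0.0.0 is not a real IP address; it stands for all IPv4 addresses of the local machine. Listening on a port of 0.0.0.0 means listening on that port on every IP of the machine.
localhost is actually a domain name. Windows usually maps localhost to 127.0.0.1 by default, but localhost is not the same thing as 127.0.0.1: the IP address it points to can be configured.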
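Put differently, 0.0.0.0 is an address to listen on, not one to connect to, which appears to be why the client call failed on Windows. The sketch below uses the standard ipaddress module to tell the two kinds of address apart and to rewrite a bind address into one a client can use. client_target is a hypothetical helper, and it assumes an IPv4 "host:port" string.

```python
import ipaddress


def client_target(server_address):
    """Replace an unspecified bind host such as 0.0.0.0 with the loopback address for client use."""
    host, _, port = server_address.rpartition(":")
    try:
        unspecified = ipaddress.ip_address(host).is_unspecified
    except ValueError:
        unspecified = False  # not an IP literal, e.g. the host name "localhost"
    return "127.0.0.1:{}".format(port) if unspecified else server_address


fixed = client_target("0.0.0.0:8500")       # bind address rewritten for the client
kept = client_target("127.0.0.1:8500")      # already a usable loopback address
by_name = client_target("localhost:8500")   # host names are passed through unchanged
```

Host names such as localhost are left alone on purpose, since what they resolve to depends on the hosts file.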
References:
http://dockone.io/article/9209
https://blog.csdn.net/weixin_34343000/article/details/88118667?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-6
https://zhuanlan.zhihu.com/p/96917543
https://zhuanlan.zhihu.com/p/52096200
https://blog.csdn.net/liyi1009365545/article/details/84780476?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase
https://blog.csdn.net/JerryZhang__/article/details/85107506?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-1