Install Docker
This is omitted here; refer to any of the many articles on installing Docker.
Install the TensorFlow Serving Docker image
$docker pull tensorflow/serving:1.8.0
Note that you can swap in whichever version you want; the available versions are listed at
https://hub.docker.com/r/tensorflow/serving/tags/.
Apart from the tags for specific TensorFlow versions, there are four image tag variants:
:latest
A minimal Docker image with a precompiled TensorFlow Serving binary; it cannot be modified and can be deployed directly.
:latest-gpu
The GPU version of :latest.
:latest-devel
devel stands for development: you can open a bash shell inside the container, modify the configuration, and then use docker commit to build a new image.
:latest-devel-gpu
The GPU version of :latest-devel.

$cd /root/software/
$git clone https://github.com/tensorflow/serving
The folder serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu
inside the repository contains a pretrained example model.
Then run:
$docker run -p 8501:8501 --mount type=bind,source=/root/software/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,target=/models/half_plus_two -e MODEL_NAME=half_plus_two -t tensorflow/serving:1.8.0
Parameter description
-p 8501:8501
Maps port 8501 on the host to port 8501 inside the Docker container. Note that port 8500 is TensorFlow Serving's gRPC port and port 8501 is its REST API port. In other words, to call your deployed model over gRPC you must map port 8500; likewise, map 8501 for the REST API (to use both, add -p 8500:8500 as well).
--mount
Mounts a directory into the container; type=bind means the host's source directory is bound to the container's target directory. source is the model directory on the host that you want to serve. target is generally /models/<model_name>, where the last component matches the model name you configure, because when the serving container starts it looks under /models for the model directory you specified.
-e MODEL_NAME=half_plus_two
The name of the model that TensorFlow Serving should load.
-t
Allocates a pseudo-terminal for the container; the final argument, tensorflow/serving:1.8.0, is the image to run.

With this we have started a Docker container serving the example model we just cloned. Let's check that it works:
$curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://localhost:8501/v1/models/half_plus_two:predict
If it returns:
{ "predictions": [2.5, 3.0, 4.5] }
the model service has been deployed successfully: whenever we send a request like {"instances": [1.0, 2.0, 5.0]}
to http://localhost:8501/v1/models/half_plus_two:predict,
we get back the result.
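The same call is easy to make from code. Below is a minimal Python sketch of the curl request above, assuming the third-party requests package is installed (pip install requests):

import requests

# Minimal REST client for the half_plus_two service started above.
url = "http://localhost:8501/v1/models/half_plus_two:predict"
payload = {"instances": [1.0, 2.0, 5.0]}

resp = requests.post(url, json=payload)
resp.raise_for_status()
print(resp.json())  # expected: {'predictions': [2.5, 3.0, 4.5]}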
Notice that the launch command includes the parameter:
source=/root/software/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu
This directory is where the model lives. If you cd into it, you will see a single directory named 00000123, which is actually the model's version number. What if there are several version directories in this folder? By default TensorFlow Serving loads the one with the largest number, which makes iterating on a model convenient. Going one level deeper, you will see the following files and folders:
assets, saved_model.pb, variables
The variables directory contains the following two files:
variables.data-00000-of-00001, variables.index
What these files are for is explained in this article: http://d0evi1.com/tensorflow/serving/saved_model/
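One more note on versions: since TensorFlow Serving picks the highest version number by default, it can be handy to pin a request to a specific version. A sketch, assuming the /v1/models/<name>/versions/<n> form of the REST API (note the version is the integer 123, without the directory name's leading zeros):

import requests

# Pin the request to model version 123 (directory 00000123).
url = "http://localhost:8501/v1/models/half_plus_two/versions/123:predict"
resp = requests.post(url, json={"instances": [1.0, 2.0, 5.0]})
print(resp.json())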
We can inspect the model's signatures with the saved_model_cli tool that ships with TensorFlow:
$saved_model_cli show --dir /root/software/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu/00000123/ --all
The output is:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['classify_x_to_y']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: unknown_rank
name: tf_example:0
The given SavedModel SignatureDef contains the following output(s):
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: y:0
Method name is: tensorflow/serving/classify
signature_def['regress_x2_to_y3']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: x2:0
The given SavedModel SignatureDef contains the following output(s):
outputs['outputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: y3:0
Method name is: tensorflow/serving/regress
signature_def['regress_x_to_y']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: unknown_rank
name: tf_example:0
The given SavedModel SignatureDef contains the following output(s):
outputs['outputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: y:0
Method name is: tensorflow/serving/regress
signature_def['regress_x_to_y2']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: unknown_rank
name: tf_example:0
The given SavedModel SignatureDef contains the following output(s):
outputs['outputs'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: y2:0
Method name is: tensorflow/serving/regress
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['x'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: x:0
The given SavedModel SignatureDef contains the following output(s):
outputs['y'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: y:0
Method name is: tensorflow/serving/predict
Here you can see each signature_def, along with the names, dtypes, and shapes of its inputs and outputs; you will need these parameters when sending prediction requests to the model.
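For example, the serving_default signature above (input x, DT_FLOAT, shape (-1, 1); output y) can be called over gRPC on port 8500. The following is only a sketch: it assumes you also published the gRPC port when starting the container (add -p 8500:8500 to the docker run command) and have the grpcio and tensorflow-serving-api packages installed:

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Connect to the gRPC endpoint (port 8500, see the parameter notes above).
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "half_plus_two"
request.model_spec.signature_name = "serving_default"
# serving_default expects input 'x': DT_FLOAT with shape (-1, 1).
request.inputs["x"].CopyFrom(
    tf.make_tensor_proto([[1.0], [2.0], [5.0]], dtype=tf.float32))

result = stub.Predict(request, 10.0)  # 10-second timeout
print(result.outputs["y"].float_val)  # expected: [2.5, 3.0, 4.5]

If you want a signature other than serving_default over REST instead, the predict API also accepts a "signature_name" field in the request JSON.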
That's all for this post. Other topics, such as exporting models, sending requests, and writing a client, will be covered later. Stay tuned…