Preface: TensorFlow Serving Quick Run
Detailed Steps
Install / Upgrade Docker to the Latest Version
Preface
Steps
Issues
Install / Upgrade Nvidia-Docker 2.0
Preface
Steps
Issues
Install / Run a GPU serving image
Preface
Steps
Deploying Your Own Project
Server Side
Client Side
References
# Download the TensorFlow Serving Docker image and repo
docker pull tensorflow/serving
git clone https://github.com/tensorflow/serving
# Location of demo models
TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"
# Start TensorFlow Serving container and open the REST API port
docker run -t --rm -p 8501:8501 \
-v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
-e MODEL_NAME=half_plus_two \
tensorflow/serving &
# Query the model using the predict API
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
# Returns => { "predictions": [2.5, 3.0, 4.5] }
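The same predict call can also be issued from Python instead of curl. Below is a minimal sketch using the third-party requests package (an assumption, it is not part of the quick run above); it expects the container started above to still be listening on localhost:8501.
# Minimal REST client for the half_plus_two toy model (sketch; assumes the
# container above is running and the `requests` package is installed)
import json
import requests
payload = {"instances": [1.0, 2.0, 5.0]}
resp = requests.post("http://localhost:8501/v1/models/half_plus_two:predict",
                     data=json.dumps(payload))
print(resp.json())  # expected: {'predictions': [2.5, 3.0, 4.5]}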
The official docs list several ways to install Docker; here we follow the recommended Install from the repository method.
1. Cannot connect to the docker daemon
~$ docker run hello-world
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?. See 'docker run --help'.
Cause: the docker service is not running. Start it with:
~$ service docker start
Or configure docker to start automatically at boot:
systemctl enable docker # enable docker to start at boot
systemctl start docker # start docker
systemctl restart docker # restart docker
Read through the prerequisites; beyond that, nothing needs special attention.
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
No issues encountered so far.
When pulling the Docker image, be careful to distinguish between the CPU and GPU versions:
docker pull tensorflow/serving
docker pull tensorflow/serving:latest-gpu
Pull the Docker image
docker pull tensorflow/serving:latest-gpu
Fetch the toy model
mkdir -p /tmp/tfserving
cd /tmp/tfserving
git clone https://github.com/tensorflow/serving
Deploy the toy model to the TensorFlow Serving container
docker run --runtime=nvidia -p 8501:8501 \
--mount type=bind,source=/tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu,target=/models/half_plus_two \
-e MODEL_NAME=half_plus_two -t tensorflow/serving:latest-gpu &
Send a prediction request
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
Command | Description |
---|---|
docker container inspect | Display detailed information on one or more containers |
docker ps -aq | List all container IDs |
docker ps -a | List all containers, whether running or stopped |
docker stop $(docker ps -aq) | Stop all containers |
docker rm $(docker ps -aq) | Remove all containers |
docker container logs | Fetch the logs of a container |
docker container ls | List detailed information on all containers |
docker image ls | List detailed information on all images |
The sections above only exercised a toy model; they did not cover how to save a trained model in a form the server can load. This part walks through deploying your own trained model and sending prediction requests from a client.
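TensorFlow Serving can only load a SavedModel laid out in a versioned directory (model_root/&lt;version&gt;/saved_model.pb plus a variables/ folder), and that model_root is what gets mounted into the container below. The following is a minimal export sketch for a TF 1.x graph, in the same spirit as the TF 1.x client script later on; the toy graph, path, and tensor names are illustrative assumptions, not the actual project model.
# Sketch: export a (toy) TF 1.x graph as a versioned SavedModel that
# tensorflow_model_server can load; replace the graph with your own.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 1], name='x')  # example input
y = tf.identity(0.5 * x + 2.0, name='y')                   # stand-in for a trained model

export_dir = './exported/my_model/1'  # the numeric version sub-directory is required
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.saved_model.simple_save(sess, export_dir,
                               inputs={'x': x}, outputs={'y': y})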
1. Bring the model online
#!/usr/bin/env bash
# Start the Docker image of the Tensorflow Server
SERVER_MDL_PATH=./path/to/exported/model/root/
MODEL_NAME=model_name
# Note: the port parameter should be in the format -p local_ip:local_port:docker_port
# - Option 1: Run Model on GPU:
docker run --runtime=nvidia \
-p 10.0.0.1:5501:8500 \
--mount type=bind,source=${SERVER_MDL_PATH},target=/models/${MODEL_NAME} \
-e MODEL_NAME=${MODEL_NAME} -t tensorflow/serving:latest-gpu &
# - Option 2: Run Model on CPU:
docker run \
-p 10.0.0.1:5501:8500 \
--mount type=bind,source=${SERVER_MDL_PATH},target=/models/${MODEL_NAME} \
-e MODEL_NAME=${MODEL_NAME} -t tensorflow/serving:latest &
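After the container starts, it can be worth confirming that the model actually loaded before wiring up a full client. A small sketch over gRPC is shown below; it assumes a tensorflow-serving-api version recent enough to ship the ModelService stubs and uses the 10.0.0.1:5501 mapping from the commands above.
# Sketch: check the served model's status over gRPC (assumes the ModelService
# stubs are available in your tensorflow-serving-api version).
import grpc
from tensorflow_serving.apis import get_model_status_pb2
from tensorflow_serving.apis import model_service_pb2_grpc

channel = grpc.insecure_channel('10.0.0.1:5501')  # same address/port as mapped above
stub = model_service_pb2_grpc.ModelServiceStub(channel)
request = get_model_status_pb2.GetModelStatusRequest()
request.model_spec.name = 'model_name'  # the MODEL_NAME passed to docker run
print(stub.GetModelStatus(request, 10.0))  # state should report AVAILABLE once loaded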
2. Get the server IP
# 2. Get the server IP address
# (Note: this is not straightly applied to everyone, you might need more steps to
# set up your router first. Example tutorial: https://www.youtube.com/watch?v=7gVHVERECu4)
curl ifconfig.me
1. int32 ops cannot be placed on the GPU when deploying a model with the GPU docker image
Error message:
Invalid argument: Cannot assign a device for operation Variable: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
The message also mentions int32 somewhere nearby.
Fix 1:
When exporting the model, set device_str = '/cpu:0' at every with tf.device(device_str): statement.
Fix 2:
When training and exporting the model, change all int32 ops and placeholders to int64.
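As a minimal illustration of fix 2 (the names here are hypothetical, not from the actual project): declare index-like placeholders as int64 from the start, so the GPU server build never has to place an int32 op.
# Sketch of fix 2: use int64 instead of int32 for index-like placeholders.
import tensorflow as tf

# ids = tf.placeholder(tf.int32, shape=[None])   # can trigger the GPU placement error
ids = tf.placeholder(tf.int64, shape=[None])     # int64 avoids it
embedding = tf.get_variable('embedding', shape=[1000, 64])
vectors = tf.nn.embedding_lookup(embedding, ids)  # lookup accepts int64 ids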
1. Install the environment and dependencies
# _____ Prerequisites _____
# 1. install Python 2.7 (Python 3 for Windows 7 or later)
# 2. install virtualenv
# 3. activate the virtualenv
# (you can find the commands of the above steps @ https://www.tensorflow.org/install/pip)
# 4. pip install tensorflow-serving-api
# (NOTE: this will also install the tensorflow-cpu module.
# If you already have tensorflow-gpu on your computer, create a new virtualenv to install this package,
# do NOT install the tensorflow-serving-api and tensorflow-gpu in the same virtualenv.)
# (NOTE: if you want to install a specific version (like v1.5 for old machine),
# use command like
# pip install 'tensorflow-serving-api-python3~=1.5.0'
# for more commands, see https://pypi.org/project/tensorflow-serving-api-python3/)
2. Modify the script below to fit your served model (edit the places marked # todo)
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 1/12/2018
# @Author : Fangliang Bai
# @Software: PyCharm Professional
# _____ Prerequisites _____
# 1. install Python 2.7 (Python 3 for Windows 7 or later)
# 2. install virtualenv
# 3. activate the virtualenv
# (you can find the commands of the above steps @ https://www.tensorflow.org/install/pip)
# 4. pip install tensorflow-serving-api
import json
import numpy as np
from grpc.beta import implementations
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc, prediction_service_pb2
# 1. Create gRPC stub
server_ip = '0.0.0.0' # Running gRPC ModelServer at 0.0.0.0:8500. # todo
server_port = int(8500) # gRPC port (default value, no need to change)
channel = implementations.insecure_channel(server_ip, server_port)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
# 2. Initial request variables
INPUT_WIDE_KEY = 'x_wide' # input tensor1 name of the network # todo
INPUT_DEEP_KEY = 'x_deep' # input tensor2 name of the network # todo
OUTPUT_KEY = 'output' # output tensor name of the network # todo
WIDE_DIM = 10 # todo
DEEP_DIM = 10 # todo
x_wide_data = np.random.rand(100).reshape(-1, 10) # input data1 # todo
x_deep_data = np.random.rand(100).reshape(-1, 10) # input data2 # todo
# 3. Initial request
request = predict_pb2.PredictRequest()
request.model_spec.name = 'model_name' # the model_name is the name registered with docker image. # todo
request.model_spec.signature_name = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
request.inputs[INPUT_WIDE_KEY].CopyFrom(
tf.contrib.util.make_tensor_proto(x_wide_data, shape=[10, WIDE_DIM], dtype=tf.float32))
request.inputs[INPUT_DEEP_KEY].CopyFrom(
tf.contrib.util.make_tensor_proto(x_deep_data, shape=[10, DEEP_DIM], dtype=tf.float32))
# 4. Send request
res = stub.Predict(request, 10.0) # 10s timeout
print(res.outputs[OUTPUT_KEY])
# # Request method 2 (NOT verified)
# # _____ NOTE _____ JSON serialization fails on numpy array data, so the
# # numpy arrays need to be converted to lists using the following class
#
#
# class NumpyEncoder(json.JSONEncoder):
# def default(self, obj):
# if isinstance(obj, np.ndarray):
# return obj.tolist()
# return json.JSONEncoder.default(self, obj)
#
#
# data1 = {"keep_prob": 1.0, "input_x": x_test[0]}
# data2 = {"keep_prob": 1.0, "input_x": x_test[1]}
# data3 = {"keep_prob": 1.0, "input_x": x_test[2]}
# param = {"instances": [data1, data2, data3]}
# param = json.dumps(param, cls=NumpyEncoder)
# import requests
# res = requests.post('http://localhost:8501/v1/models/find_lemma_category:predict', data=param)
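For comparison with the gRPC path above, a REST version of the same request might look like the sketch below. It is equally unverified against this model and additionally assumes the container publishes the REST port (e.g. -p 8501:8501), which the gRPC-only docker run commands above do not.
# Sketch: REST predict request with named inputs (unverified; assumes the REST
# port 8501 is published on the container, unlike the gRPC-only commands above)
import requests
instance = {'x_wide': x_wide_data[0].tolist(), 'x_deep': x_deep_data[0].tolist()}
payload = {'instances': [instance]}
resp = requests.post('http://10.0.0.1:8501/v1/models/model_name:predict', json=payload)
print(resp.json())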
Sending prediction requests from an external network
Approach 1: port redirection
1. [Reset rules] https://wiki.archlinux.org/index.php/Iptables_(%E7%AE%80%E4%BD%93%E4%B8%AD%E6%96%87)#%E9%87%8D%E7%BD%AE%E8%A7%84%E5%88%99
2. [Reset rules] https://www.cnblogs.com/hongchenok/p/3577354.html
3. [Setup] http://www.voidcn.com/article/p-mgnvivhg-gr.html
4. [Setup] https://stackoverrun.com/cn/q/3981105
5. [Setup] https://www.oschina.net/question/141942_2237299
6. [Setup] https://blog.csdn.net/javaee_ssh/article/details/22167149