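This article shows how to quickly deploy a trained model with TensorFlow Serving and expose it as a service. Docker is the simplest way to run TensorFlow Serving; the walkthrough here installs it natively with apt instead. For online serving, the officially recommended format is SavedModel, whereas models deployed to phones and other mobile devices generally use the FrozenGraphDef format.
We train a neural network to classify images of clothing, such as sneakers and shirts, and then put it online with TensorFlow Serving.
Import the dependencies: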
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess
tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)
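This article uses the Fashion MNIST dataset. It contains 70,000 grayscale images (60,000 for training and 10,000 for testing) in 10 categories, each at a resolution of 28 x 28 pixels. The data is shown in the figure below:
Load the data: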
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0
# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))
Output:
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
40960/29515 [=========================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
26435584/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
16384/5148 [===============================================================================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
4431872/4422102 [==============================] - 0s 0us/step
train_images.shape: (60000, 28, 28, 1), of float64
test_images.shape: (10000, 28, 28, 1), of float64
# Build a small CNN: one strided convolution followed by a softmax classifier
model = keras.Sequential([
    keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3,
                        strides=2, activation='relu', name='Conv1'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation=tf.nn.softmax, name='Softmax')
])
model.summary()
testing = False
epochs = 5
# Labels are integer class ids, hence the sparse categorical cross-entropy loss
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=epochs)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))
Output:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
Conv1 (Conv2D)               (None, 13, 13, 8)         80
_________________________________________________________________
flatten (Flatten)            (None, 1352)              0
_________________________________________________________________
Softmax (Dense)              (None, 10)                13530
=================================================================
Total params: 13,610
Trainable params: 13,610
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
60000/60000 [==============================] - 8s 140us/sample - loss: 0.5265 - acc: 0.8204
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.3753 - acc: 0.8688
Epoch 3/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3423 - acc: 0.8788
Epoch 4/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3207 - acc: 0.8856
Epoch 5/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3069 - acc: 0.8906
10000/10000 [==============================] - 1s 70us/sample - loss: 0.3464 - acc: 0.8772
Test accuracy: 0.877200007439
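The output shapes and parameter counts in the model summary can be checked by hand. The sketch below is plain arithmetic, not part of the original notebook; the helper names are made up for illustration, and it assumes the Keras default of 'valid' (no) padding for Conv2D, which is what the model above uses.

```python
def conv2d_output_size(size, kernel, stride):
    """Output length along one axis for a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def conv2d_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return inputs * units + units


side = conv2d_output_size(28, kernel=3, stride=2)  # spatial size after Conv1
flat = side * side * 8                             # Flatten output length
conv = conv2d_params(kernel=3, in_channels=1, filters=8)
dense = dense_params(flat, 10)

print(side, flat, conv, dense, conv + dense)
# prints: 13 1352 80 13530 13610
```

These values match the (None, 13, 13, 8), (None, 1352), 80, 13530 and 13,610 entries reported by model.summary().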
To load the trained model into TensorFlow Serving, it first has to be saved in the SavedModel format.
# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors,
# and stored with the default serving key
import tempfile
MODEL_DIR = tempfile.gettempdir()
# Export MODEL_DIR so the shell cell that starts the server can read ${MODEL_DIR}
os.environ["MODEL_DIR"] = MODEL_DIR
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))
if os.path.isdir(export_path):
    print('\nAlready saved a model, cleaning up\n')
    !rm -r {export_path}
tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name: t for t in model.outputs})
print('\nSaved model:')
!ls -l {export_path}
The model is saved to /tmp/1. That directory contains saved_model.pb and a variables directory, and the variables directory holds two files: variables.data-00000-of-00001 and variables.index.
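TensorFlow Serving treats each numeric subdirectory of the model base path as one model version and, by default, serves the highest one. The following is a minimal sketch of that directory convention, not TensorFlow Serving's own code: list_versions is a hypothetical helper, and the demo runs against a throwaway directory with empty placeholder files rather than a real export.

```python
import os
import tempfile


def list_versions(base_path):
    """Return the numeric version directories that contain a saved_model.pb."""
    versions = []
    for name in os.listdir(base_path):
        candidate = os.path.join(base_path, name)
        has_graph = os.path.isfile(os.path.join(candidate, "saved_model.pb"))
        if name.isdigit() and has_graph:
            versions.append(int(name))
    return sorted(versions)


# Build a fake base path with versions 1 and 2 plus an unrelated directory.
demo_base = tempfile.mkdtemp()
for v in ("1", "2"):
    os.makedirs(os.path.join(demo_base, v, "variables"))
    open(os.path.join(demo_base, v, "saved_model.pb"), "w").close()
os.makedirs(os.path.join(demo_base, "notes"))

demo_versions = list_versions(demo_base)
print(demo_versions, "latest:", demo_versions[-1])
# prints: [1, 2] latest: 2
```

Exporting a retrained model to /tmp/2 would therefore add a second version next to /tmp/1 without touching the first.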
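Use the saved_model_cli command to inspect the MetaGraphDefs and SignatureDefs in the SavedModel. A SavedModel contains one or more MetaGraphDefs, each identified by its tag-set. To serve a model you need to know what kind of SignatureDefs each MetaGraphDef contains and what their inputs and outputs are. The show command examines the contents of the SavedModel in hierarchical order.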
!saved_model_cli show --dir {export_path} --all
Result:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: Conv1_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['Softmax/Softmax:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: Softmax/Softmax:0
  Method name is: tensorflow/serving/predict
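The input shape (-1, 28, 28, 1) in the signature means "any batch size, then 28 x 28 x 1 per image". A client can check its payload against that shape before sending a request. The sketch below is illustrative only: nested_shape and matches_signature are hypothetical helpers, and nested_shape inspects only the first element at each level, so it assumes the list is rectangular.

```python
def nested_shape(value):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def matches_signature(shape, signature_shape):
    """True if shape fits signature_shape, where -1 accepts any size."""
    if len(shape) != len(signature_shape):
        return False
    return all(want == -1 or got == want
               for got, want in zip(shape, signature_shape))


SIGNATURE_SHAPE = (-1, 28, 28, 1)  # taken from the saved_model_cli output

# Three all-zero stand-in images in the layout the model expects.
batch = [[[[0.0] for _ in range(28)] for _ in range(28)] for _ in range(3)]
# A single 28 x 28 image without batch or channel axes.
bare_image = [[0.0] * 28 for _ in range(28)]

print(nested_shape(batch), matches_signature(nested_shape(batch), SIGNATURE_SHAPE))
# prints: (3, 28, 28, 1) True
print(nested_shape(bare_image), matches_signature(nested_shape(bare_image), SIGNATURE_SHAPE))
# prints: (28, 28) False
```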
The simplest way to install TensorFlow Serving is Docker. This article installs it natively instead.
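For comparison, the Docker route can be sketched as follows. This is not what the rest of the article runs: it assumes Docker is installed and uses the public tensorflow/serving image, which by convention serves REST on port 8501 and loads the model from /models/<MODEL_NAME> inside the container. docker_serving_command is a hypothetical helper that only assembles the command line.

```python
import subprocess


def docker_serving_command(model_dir, model_name, rest_port=8501):
    """Assemble a `docker run` command for the tensorflow/serving image.

    model_dir is the host directory that contains the numeric version
    subdirectories (for this article, /tmp, which holds /tmp/1).
    """
    return [
        "docker", "run", "-t", "--rm",
        "-p", "{0}:8501".format(rest_port),                     # host port -> REST port
        "-v", "{0}:/models/{1}".format(model_dir, model_name),  # mount the model
        "-e", "MODEL_NAME={0}".format(model_name),
        "tensorflow/serving",
    ]


cmd = docker_serving_command("/tmp", "fashion_model")
print(" ".join(cmd))

# To actually start the container (requires Docker), one could run:
# server = subprocess.Popen(cmd)
```

Mounting the whole temporary directory is only acceptable for a demo; in practice the model would live in a dedicated directory.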
First add the TensorFlow Serving apt source and refresh the package index:
# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo
# You would instead do:
# echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \
# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -
!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update
Update output:
deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2943 100 2943 0 0 7827 0 --:--:-- --:--:-- --:--:-- 7827
OK
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]
Ign:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease
Get:4 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ Packages [70.5 kB]
Get:5 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [365 B]
Ign:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release [564 B]
Get:8 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release [564 B]
Get:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release.gpg [819 B]
Get:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release.gpg [833 B]
Get:11 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease [21.3 kB]
Get:12 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:13 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [357 B]
Get:14 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Packages [113 kB]
Hit:15 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:16 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Packages [19.8 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:18 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]
Get:19 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [9,585 B]
Get:20 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic/main amd64 Packages [31.7 kB]
Get:21 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [769 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:23 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,686 kB]
Get:24 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [959 kB]
Get:25 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [662 kB]
Get:26 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [810 kB]
Get:27 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,288 kB]
Get:28 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [5,230 B]
Get:29 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [8,284 B]
Get:30 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [20.3 kB]
Get:31 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [4,227 B]
Fetched 6,755 kB in 9s (735 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
107 packages can be upgraded. Run 'apt list --upgradable' to see them.
Install the server with apt:
!apt-get install tensorflow-model-server
Installation output:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
tensorflow-model-server
0 upgraded, 1 newly installed, 0 to remove and 107 not upgraded.
Need to get 151 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 1.14.0 [151 MB]
Fetched 151 MB in 3s (46.3 MB/s)
Selecting previously unselected package tensorflow-model-server.
(Reading database ... 131183 files and directories currently installed.)
Preparing to unpack .../tensorflow-model-server_1.14.0_all.deb ...
Unpacking tensorflow-model-server (1.14.0) ...
Setting up tensorflow-model-server (1.14.0) ...
The following starts the server with its REST API enabled (the alternative is gRPC).
%%bash --bg
nohup tensorflow_model_server \
--rest_api_port=8501 \
--model_name=fashion_model \
--model_base_path="${MODEL_DIR}" >server.log 2>&1
Here --bg runs the cell's process in the background.
View the log: tail server.log
Log output:
2019-09-19 02:18:22.039966: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-09-19 02:18:22.041055: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-19 02:18:22.054399: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2019-09-19 02:18:22.066228: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 27522 microseconds.
2019-09-19 02:18:22.066280: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:103] No warmup data file found at /tmp/1/assets.extra/tf_serving_warmup_requests
2019-09-19 02:18:22.066354: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: fashion_model version: 1}
2019-09-19 02:18:22.067453: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2019-09-19 02:18:22.068189: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 239] RAW: Entering the event loop ...
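Instead of reading the log, a client can ask the server whether the model has finished loading. TensorFlow Serving's REST API exposes a model status endpoint at /v1/models/<model_name>. The sketch below polls it using only the standard library. The helper names, retry count and delay are illustrative assumptions, and the sample JSON shows the general shape of a status response rather than output captured from this server.

```python
import json
import time
import urllib.request


def model_is_available(status_json):
    """True if any version in a model status response is in state AVAILABLE."""
    status = json.loads(status_json)
    return any(v.get("state") == "AVAILABLE"
               for v in status.get("model_version_status", []))


def wait_until_ready(url, attempts=10, delay=1.0):
    """Poll the status URL until the model is available or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if model_is_available(resp.read().decode("utf-8")):
                    return True
        except (OSError, ValueError):
            # Server not up yet, or the response was not valid JSON.
            pass
        time.sleep(delay)
    return False


# Offline check of the parsing logic with a hand-written sample response.
sample = ('{"model_version_status": [{"version": "1", "state": "AVAILABLE", '
          '"status": {"error_code": "OK", "error_message": ""}}]}')
print(model_is_available(sample))
# prints: True
```

With the server from this article running, wait_until_ready("http://localhost:8501/v1/models/fashion_model") would be the call to make before sending predictions.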
(1) Look at the data to be classified:
import random

def show(idx, title):
    # Display one test image with the given title
    plt.figure()
    plt.imshow(test_images[idx].reshape(28,28))
    plt.axis('off')
    plt.title('\n\n{}'.format(title), fontdict={'size': 16})

rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))
(2) Package the data
Wrap the images in a JSON request body. The following packages three images:
import json
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
Output:
Data: {"instances": [[[[0.0], [0.0], [0.0], [0.0], [0.0] ... 0.0], [0.0]]]], "signature_name": "serving_default"}
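The request above uses the row format, where "instances" is a list with one entry per example. TensorFlow Serving's REST predict API also accepts a columnar format, where "inputs" maps each named input tensor to the whole batch; in that case the response carries "outputs" instead of "predictions". The sketch below builds both bodies from all-zero stand-in images rather than the real test set; the input name input_image comes from the signature saved earlier.

```python
import json

# Three all-zero stand-in images with shape (3, 28, 28, 1).
images = [[[[0.0] for _ in range(28)] for _ in range(28)] for _ in range(3)]

# Row format: one list entry per example (what this article sends).
row_body = json.dumps({
    "signature_name": "serving_default",
    "instances": images,
})

# Columnar format: the batch is keyed by the signature's input name.
col_body = json.dumps({
    "signature_name": "serving_default",
    "inputs": {"input_image": images},
})

print(sorted(json.loads(row_body).keys()))
# prints: ['instances', 'signature_name']
print(sorted(json.loads(col_body).keys()))
# prints: ['inputs', 'signature_name']
```

For a model with a single input the two formats are interchangeable; the columnar format is mainly useful when inputs have different batch sizes or shapes.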
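(3) Send a REST request
If requests is not installed, install it first with pip install -q requests.
Send the request to the server's REST endpoint with POST. If no particular servable version is specified, the request goes to the latest version.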
import requests
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
# Report the predicted class first, then the true label
show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))
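Note: v1 in the URI is the version of the REST API, not the version of the model, so it stays v1 no matter how many model versions exist. To address a particular model version, add /versions/<version number> after the model name. The models segment of the URI is fixed, and fashion_model is the value passed as --model_name when tensorflow_model_server was started.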
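The pieces of the predict URI can be assembled with a small helper. This is a minimal sketch: predict_url is a hypothetical function, and the host and port defaults match the server started in this article.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build a TensorFlow Serving REST predict URL.

    "/v1" is the REST API prefix and "models" is a fixed path segment;
    version, if given, selects one model version via /versions/<n>.
    """
    path = "/v1/models/{0}".format(model_name)
    if version is not None:
        path += "/versions/{0}".format(version)
    return "http://{0}:{1}{2}:predict".format(host, port, path)


print(predict_url("fashion_model"))
# prints: http://localhost:8501/v1/models/fashion_model:predict
print(predict_url("fashion_model", version=1))
# prints: http://localhost:8501/v1/models/fashion_model/versions/1:predict
```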
Result:
Now let's request a specific version of the servable. Since there is only one version, we choose version 1. We also look at all three results.
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
for i in range(0,3):
    # Report the predicted class first, then the true label
    show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
        class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))
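The calls above index ['predictions'] directly, which raises a KeyError if the server rejects the request. When a request fails, TensorFlow Serving returns a JSON body with an "error" key. The sketch below shows one way to handle both cases; parse_predict_response and argmax are hypothetical helpers, and the response strings are hand-written samples rather than real server output.

```python
import json


def parse_predict_response(status_code, body_text):
    """Return the predictions list, or raise if the server reported an error."""
    body = json.loads(body_text)
    if status_code != 200 or "error" in body:
        raise RuntimeError("predict failed ({0}): {1}".format(
            status_code, body.get("error", body_text)))
    return body["predictions"]


def argmax(values):
    """Index of the largest value, a pure-Python stand-in for np.argmax."""
    return max(range(len(values)), key=values.__getitem__)


# Hand-written sample responses (three classes instead of ten, for brevity).
ok_body = '{"predictions": [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]}'
bad_body = '{"error": "example error message"}'

labels = [argmax(p) for p in parse_predict_response(200, ok_body)]
print(labels)
# prints: [1, 0]

try:
    parse_predict_response(400, bad_body)
except RuntimeError as exc:
    print(exc)
# prints: predict failed (400): example error message
```

With requests, the two arguments would be json_response.status_code and json_response.text.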