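Deploying a TensorFlow Service: An Image Classification Example

Background

This article shows how to quickly deploy a trained model with TensorFlow Serving and expose it as a service. TensorFlow Serving is often run from its Docker image, but the steps here install it natively with apt. For online serving, the officially recommended model format is SavedModel, whereas models deployed to phones and other mobile devices generally use the FrozenGraphDef format.

We train a neural network to classify images of clothing, such as sneakers and shirts, and then deploy it online with TensorFlow Serving.

Model training

Import the dependencies: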

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess

tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)

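Importing the data

This article uses the Fashion MNIST dataset. It contains 70,000 grayscale images in 10 classes, and each image is 28 × 28 pixels.
[Figure: sample images from the Fashion MNIST dataset]

Load the data: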

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))

Output:

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
40960/29515 [=========================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
26435584/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
16384/5148 [===============================================================================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
4431872/4422102 [==============================] - 0s 0us/step

train_images.shape: (60000, 28, 28, 1), of float64
test_images.shape: (10000, 28, 28, 1), of float64
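
The scaling and reshaping steps can be tried without downloading the dataset. The following is a minimal sketch on a synthetic array that stands in for Fashion MNIST; the random data and the variable names are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np

# Synthetic stand-in for the dataset: 5 grayscale images of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Dividing a uint8 array by a Python float promotes it to float64,
# which is why the output above reports float64.
scaled = raw / 255.0

# Add a trailing channel axis so each image has shape (28, 28, 1),
# matching the input_shape the Conv2D layer is given later.
batch = scaled.reshape(scaled.shape[0], 28, 28, 1)

print(batch.shape, batch.dtype)
print(float(batch.min()) >= 0.0, float(batch.max()) <= 1.0)
```

The reshape does not copy or reorder pixel values; it only adds the channel dimension that the convolution layer expects.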

Training the model

model = keras.Sequential([
  keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3, 
                      strides=2, activation='relu', name='Conv1'),
  keras.layers.Flatten(),
  keras.layers.Dense(10, activation=tf.nn.softmax, name='Softmax')
])
model.summary()

testing = False
epochs = 5

model.compile(optimizer=tf.train.AdamOptimizer(), 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=epochs)

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))

Output:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Conv1 (Conv2D)               (None, 13, 13, 8)         80        
_________________________________________________________________
flatten (Flatten)            (None, 1352)              0         
_________________________________________________________________
Softmax (Dense)              (None, 10)                13530     
=================================================================
Total params: 13,610
Trainable params: 13,610
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
60000/60000 [==============================] - 8s 140us/sample - loss: 0.5265 - acc: 0.8204
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.3753 - acc: 0.8688
Epoch 3/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3423 - acc: 0.8788
Epoch 4/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3207 - acc: 0.8856
Epoch 5/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3069 - acc: 0.8906
10000/10000 [==============================] - 1s 70us/sample - loss: 0.3464 - acc: 0.8772

Test accuracy: 0.877200007439
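
The output shapes and parameter counts in the model summary can be checked by hand. The sketch below assumes 'valid' padding, which is the Keras default for Conv2D and is not overridden in the model definition; the helper function names are illustrative.

```python
def conv2d_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution with 'valid' (no) padding."""
    return (input_size - kernel_size) // stride + 1


def conv2d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(in_features, units):
    """Weight matrix plus one bias per unit."""
    return in_features * units + units


side = conv2d_output_size(28, kernel_size=3, stride=2)    # Conv1 output height/width
flat = side * side * 8                                    # Flatten output length
conv_count = conv2d_params(3, in_channels=1, filters=8)   # Conv1 parameters
dense_count = dense_params(flat, units=10)                # Softmax layer parameters

print(side, flat, conv_count, dense_count, conv_count + dense_count)
```

The values agree with the summary: a 13 × 13 × 8 feature map, 1352 flattened features, 80 and 13530 parameters, and 13,610 in total.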

Saving the model

To load the trained model into TensorFlow Serving, it first has to be saved in the SavedModel format.

# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors,
# and stored with the default serving key
import tempfile

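MODEL_DIR = tempfile.gettempdir()
# Export MODEL_DIR so the %%bash cell that starts the server can expand ${MODEL_DIR}
os.environ["MODEL_DIR"] = MODEL_DIR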
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))
if os.path.isdir(export_path):
  print('\nAlready saved a model, cleaning up\n')
  !rm -r {export_path}

tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name:t for t in model.outputs})

print('\nSaved model:')
!ls -l {export_path}

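The model is saved to /tmp/1 (tempfile.gettempdir() returned /tmp on this machine, and 1 is the version number). That directory contains saved_model.pb and a variables directory.
The variables directory contains the following two files: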

variables.data-00000-of-00001  variables.index
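
Before starting the server it can be worth confirming that the export directory has this layout. The following is a small sketch that checks only for the files listed above; the function name is hypothetical, and the demonstration runs on a scratch directory with empty placeholder files rather than a real export.

```python
import os
import tempfile


def missing_saved_model_files(export_dir):
    """Return the expected SavedModel files that are absent from export_dir."""
    expected = [
        "saved_model.pb",
        os.path.join("variables", "variables.index"),
    ]
    return [rel for rel in expected
            if not os.path.exists(os.path.join(export_dir, rel))]


# Demonstration on a scratch directory laid out like <base_path>/<version>/.
with tempfile.TemporaryDirectory() as scratch:
    version_dir = os.path.join(scratch, "1")
    os.makedirs(os.path.join(version_dir, "variables"))
    open(os.path.join(version_dir, "saved_model.pb"), "wb").close()
    before = missing_saved_model_files(version_dir)

    open(os.path.join(version_dir, "variables", "variables.index"), "wb").close()
    after = missing_saved_model_files(version_dir)

print(before, after)
```

In the article's setup the call would be missing_saved_model_files(export_path), and an empty list means the expected files are present.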

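Inspecting the saved model

Use the saved_model_cli command to inspect the MetaGraphDefs and SignatureDefs in the SavedModel. A SavedModel contains one or more MetaGraphDefs, each identified by its tag-set. Before serving a model, you may want to know what kind of SignatureDefs each MetaGraphDef contains and what their inputs and outputs are. The show command lets you examine the contents of the SavedModel hierarchically.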

!saved_model_cli show --dir {export_path} --all

Output:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: Conv1_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['Softmax/Softmax:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: Softmax/Softmax:0
  Method name is: tensorflow/serving/predict

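Deploying the service with TensorFlow Serving

Installing tensorflow-model-server

The simplest way to install it is with Docker. This article uses a native installation instead.
First, update the apt sources: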

# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo
# You would instead do:
# echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \
# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -

!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update

Output of the update:

deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2943  100  2943    0     0   7827      0 --:--:-- --:--:-- --:--:--  7827
OK
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]
Ign:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
Get:4 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ Packages [70.5 kB]
Get:5 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [365 B]
Ign:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Release [564 B]
Get:8 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release [564 B]
Get:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Release.gpg [819 B]
Get:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release.gpg [833 B]
Get:11 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease [21.3 kB]
Get:12 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:13 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [357 B]
Get:14 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Packages [113 kB]
Hit:15 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:16 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages [19.8 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:18 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]
Get:19 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [9,585 B]
Get:20 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic/main amd64 Packages [31.7 kB]
Get:21 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [769 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:23 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,686 kB]
Get:24 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [959 kB]
Get:25 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [662 kB]
Get:26 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [810 kB]
Get:27 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,288 kB]
Get:28 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [5,230 B]
Get:29 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [8,284 B]
Get:30 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [20.3 kB]
Get:31 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [4,227 B]
Fetched 6,755 kB in 9s (735 kB/s)
Reading package lists... Done
Building dependency tree       
Reading state information... Done
107 packages can be upgraded. Run 'apt list --upgradable' to see them.

Installing TensorFlow Serving

The package is installed with apt:

!apt-get install tensorflow-model-server

Output of the installation:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tensorflow-model-server
0 upgraded, 1 newly installed, 0 to remove and 107 not upgraded.
Need to get 151 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 1.14.0 [151 MB]
Fetched 151 MB in 3s (46.3 MB/s)
Selecting previously unselected package tensorflow-model-server.
(Reading database ... 131183 files and directories currently installed.)
Preparing to unpack .../tensorflow-model-server_1.14.0_all.deb ...
Unpacking tensorflow-model-server (1.14.0) ...
Setting up tensorflow-model-server (1.14.0) ...

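Starting TensorFlow Serving

The server is started below with its REST API on port 8501. The other way to reach it is gRPC, which the log further down shows listening on port 8500.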

%%bash --bg 
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=fashion_model \
  --model_base_path="${MODEL_DIR}" >server.log 2>&1

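Here --bg tells the %%bash cell magic to run the script in the background, so the notebook is not blocked while the server is running.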
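
Outside a notebook, the same launch can be done from Python with subprocess, which the article imports but does not otherwise use. This is a sketch under assumptions: the function names are hypothetical, and only the three flags used in the cell above are passed.

```python
import subprocess


def model_server_command(model_base_path, model_name="fashion_model",
                         rest_api_port=8501):
    """Build the argument list using the same flags as the %%bash cell."""
    return [
        "tensorflow_model_server",
        "--rest_api_port={}".format(rest_api_port),
        "--model_name={}".format(model_name),
        "--model_base_path={}".format(model_base_path),
    ]


def start_model_server(model_base_path, log_path="server.log"):
    """Start the server in the background and return the Popen handle.

    stdout and stderr both go to log_path, like '>server.log 2>&1'.
    The child keeps its own copy of the file descriptor, so the parent
    can close the file once the process has been spawned.
    """
    with open(log_path, "w") as log_file:
        return subprocess.Popen(
            model_server_command(model_base_path),
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )


# Only the command is built here; nothing is launched.
command = model_server_command("/tmp")
print(" ".join(command))
```

Calling start_model_server(MODEL_DIR) would launch the process, and the returned handle can later be stopped with its terminate() method.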

View the log with: tail server.log
The log output:

2019-09-19 02:18:22.039966: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-09-19 02:18:22.041055: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-19 02:18:22.054399: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2019-09-19 02:18:22.066228: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 27522 microseconds.
2019-09-19 02:18:22.066280: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:103] No warmup data file found at /tmp/1/assets.extra/tf_serving_warmup_requests
2019-09-19 02:18:22.066354: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: fashion_model version: 1}
2019-09-19 02:18:22.067453: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2019-09-19 02:18:22.068189: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 239] RAW: Entering the event loop ...
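
Instead of reading the log, a client can poll the server until the model is loaded. The sketch below assumes the model status endpoint of the TensorFlow Serving REST API (a GET on /v1/models/fashion_model) and a response containing a model_version_status list whose entries carry version and state fields. Check both against the installed version; the function names and the sample response are illustrative.

```python
import json
import time
import urllib.request


def is_available(status, version=None):
    """True if any version (or the given one) reports state AVAILABLE."""
    for entry in status.get("model_version_status", []):
        if version is not None and str(entry.get("version")) != str(version):
            continue
        if entry.get("state") == "AVAILABLE":
            return True
    return False


def wait_until_available(url="http://localhost:8501/v1/models/fashion_model",
                         timeout_s=30.0, poll_s=0.5):
    """Poll the status URL until the model is available or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5.0) as resp:
                if is_available(json.load(resp)):
                    return True
        except (OSError, ValueError):
            # Connection refused while the server starts, or a non-JSON body.
            pass
        time.sleep(poll_s)
    return False


# Assumed shape of a status response, used here only to exercise the parser.
sample_status = {
    "model_version_status": [
        {"version": "1", "state": "AVAILABLE",
         "status": {"error_code": "OK", "error_message": ""}}
    ]
}
print(is_available(sample_status))
```

Only is_available is exercised here; wait_until_available needs a running server.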

Building the request

(1) Look at the data to be classified:

def show(idx, title):
  plt.figure()
  plt.imshow(test_images[idx].reshape(28,28))
  plt.axis('off')
  plt.title('\n\n{}'.format(title), fontdict={'size': 16})

import random
rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))

Result:
[Figure 1: the randomly selected test image, titled with its class name]

(2) Package the data
Wrap the images in a JSON request body. The code below packages three images:

import json
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))

Output:

Data: {"instances": [[[[0.0], [0.0], [0.0], [0.0], [0.0] ... 0.0], [0.0]]]], "signature_name": "serving_default"}
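
A request whose tensors do not match the signature's input shape of (-1, 28, 28, 1) will be rejected by the server, so it can help to check the shape before serializing. The following is a minimal sketch with a hypothetical helper, run on an all-zero batch instead of real test images.

```python
import json

import numpy as np

# Per-image shape taken from the serving_default signature shown earlier.
EXPECTED_IMAGE_SHAPE = (28, 28, 1)


def build_predict_payload(images, signature_name="serving_default"):
    """Validate the batch shape and serialize it in the 'instances' format."""
    images = np.asarray(images)
    if images.ndim != 4 or images.shape[1:] != EXPECTED_IMAGE_SHAPE:
        raise ValueError(
            "expected shape (N, 28, 28, 1), got {}".format(images.shape))
    return json.dumps({"signature_name": signature_name,
                       "instances": images.tolist()})


fake_batch = np.zeros((3, 28, 28, 1))
payload = build_predict_payload(fake_batch)

# Round-trip the JSON to confirm the nesting survives serialization.
decoded = json.loads(payload)
print(np.asarray(decoded["instances"]).shape)
```

With the article's data the call would be build_predict_payload(test_images[0:3]).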

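(3) Create the REST request
If requests is not installed, install it first with pip install -q requests
Send the request as a POST to the server's REST endpoint. If no specific servable version is given, the request goes to the latest version.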

import requests
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
  class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))

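Note: the v1 in the URI is the version of TensorFlow Serving's REST API, not the version of the model. The model version is selected with an optional /versions/<N> segment after the model name; when it is omitted, the latest version is used.
The models segment of the URI is fixed, and fashion_model is the value passed to --model_name when tensorflow_model_server was started.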
Result:
[Figure 2: test image 0, titled with the predicted and actual classes]
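
The predictions field of the response is a list with one 10-element probability vector per image. Turning those vectors into class names can be sketched without a running server; the probabilities and labels below are made up for illustration, and the helper name is hypothetical.

```python
import numpy as np

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def describe_predictions(predictions, labels):
    """Pair each probability vector with its predicted and true class."""
    rows = []
    for probs, label in zip(predictions, labels):
        predicted = int(np.argmax(probs))  # index of the largest probability
        actual = int(label)
        rows.append((class_names[predicted], predicted,
                     class_names[actual], actual, predicted == actual))
    return rows


# Made-up stand-ins for json.loads(json_response.text)['predictions'].
fake_predictions = [
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.91],
    [0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.03, 0.02],
]
fake_labels = [9, 6]

rows = describe_predictions(fake_predictions, fake_labels)
for name, idx, true_name, true_idx, correct in rows:
    print('predicted {} (class {}), actual {} (class {}), correct: {}'.format(
        name, idx, true_name, true_idx, correct))
```

The predicted class comes from np.argmax over the probabilities, while the actual class comes from the label, so the two can differ, as in the second made-up row.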

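Requesting a specific version

Now let's request a specific version of the servable. Since there is only one version, we select version 1. We will also look at all three results.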

headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

for i in range(0,3):
  show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))

Output:
[Figures 3 to 5: test images 0, 1 and 2, each titled with the predicted and actual classes]
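
The two predict URLs used in this article differ only in the optional /versions/<N> segment. A small helper can make that explicit; the function name and its default host and port are illustrative assumptions that mirror the values used above.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build the REST predict URL, optionally pinned to one model version."""
    url = "http://{}:{}/v1/models/{}".format(host, port, model_name)
    if version is not None:
        url += "/versions/{}".format(version)
    return url + ":predict"


latest_url = predict_url("fashion_model")
pinned_url = predict_url("fashion_model", version=1)
print(latest_url)
print(pinned_url)
```

Either string can be passed as the first argument to requests.post in the snippets above.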
