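Pitfalls when building TensorFlow Serving from source with Docker

Building TensorFlow Serving from source with Docker

  • Installing Docker on Windows 10
  • Building from source with the Dockerfiles
    • Step 1: Clone with git
    • Step 2: Build the image with development tools
    • Step 3: Build the image without development tools
    • Step 4: Export and import the image
    • Step 5: Serving half_plus_two
    • Step 6: Serving ResNet-50 v1 Model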

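Installing Docker on Windows 10

1. Download Docker from: https://store.docker.com/editions/community/docker-ce-desktop-windows
2. The current version of Docker for Windows runs on 64-bit Windows 10 Pro, Enterprise, and Education (version 1607 Anniversary Update, build 14393 or later).
My machine runs Windows 10 Pro, but its build number was 12***, so I had to upgrade Windows first.
3. Hyper-V must be enabled: open "Turn Windows features on or off" and check "Hyper-V".
At first one of the Hyper-V entries was greyed out and could not be selected. In that case hardware virtualization has to be enabled in the BIOS: enter the BIOS, use the left/right arrow keys to select Configuration, choose SVM Support (AMD processors) or Intel Virtual Technology (Intel processors), press Enter and select Enabled, then press F10 and Enter to save and reboot.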

Building from source with the Dockerfiles

These steps mainly follow https://github.com/IntelAI/models/blob/master/docs/general/tensorflow_serving/InstallationGuide.md#installation
For general Docker usage, see https://yeasy.gitbooks.io/docker_practice/

Step 1: Clone with git

git clone https://github.com/tensorflow/serving.git

Step 2: Build the image with development tools

cd serving/tensorflow_serving/tools/docker/
This directory contains Dockerfile, Dockerfile.devel, Dockerfile.devel-gpu, Dockerfile.devel-mkl, Dockerfile.gpu, and Dockerfile.mkl.
MKL is an optimized math library from Intel, and the devel variants include the development environment. Build the devel image first and the non-devel image second, because the latter is based on the former.
To use Intel's MKL library:
docker build -f Dockerfile.devel-mkl --build-arg TF_SERVING_VERSION_GIT_BRANCH="1.13.0" -t tensorflow/serving:latest-devel-mkl .
Without MKL:
docker build -f Dockerfile.devel --build-arg TF_SERVING_VERSION_GIT_BRANCH="1.13.0" -t tensorflow/serving .

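If you have a GPU, use the GPU Dockerfiles instead.
I ran into several major pitfalls here, mostly because downloads are blocked or slow from mainland China, so apt-get and pip have to be pointed at mirrors in China.
1. apt-get often fails to download packages
Add the following before the apt-get commands in the Dockerfile to switch to a mirror in China: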
RUN sed -i s@/archive.ubuntu.com/@/mirrors.aliyun.com/@g /etc/apt/sources.list
RUN apt-get clean

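This is a sed substitution. The @ plays the same role as the usual / delimiter; it is used here because the text being replaced itself contains /.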
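To check what the substitution does without rebuilding the image, here is a minimal Python sketch of the same rewrite. The sample sources.list lines are assumptions for illustration and are not copied from the actual base image.

```python
import re


def use_aliyun_mirror(sources_list: str) -> str:
    """Mimic: sed -i s@/archive.ubuntu.com/@/mirrors.aliyun.com/@g"""
    # The sed pattern is a regular expression, so its dots match any
    # character; re.sub with the same unescaped pattern behaves the same way.
    return re.sub(r"/archive.ubuntu.com/", "/mirrors.aliyun.com/", sources_list)


# Hypothetical sources.list entries, for illustration only.
sample = "deb http://archive.ubuntu.com/ubuntu/ xenial main restricted\n"
other = "deb http://security.ubuntu.com/ubuntu/ xenial-security main\n"

print(use_aliyun_mirror(sample), end="")
print(use_aliyun_mirror(other), end="")
```

Only lines that point at archive.ubuntu.com are rewritten; entries for other hosts are left as they are.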
2.Premature EOF
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: error loading package '': in /tensorflow-serving/tensorflow_serving/workspace.bzl: Encountered error while reading extension file 'tensorflow/workspace.bzl':
no such package '@org_tensorflow//tensorflow': java.io.IOException: Error downloading
[https://mirror.bazel.build/github.com/tensorflow/tensorflow/archive/6612da89516247503f03ef76e974b51a434fb52e.tar.gz,
https://github.com/tensorflow/tensorflow/archive/6612da89516247503f03ef76e974b51a434fb52e.tar.gz]
to /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/org_tensorflow/6612da89516247503f03ef76e974b51a434fb52e.tar.gz: Premature EOF

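I could not find where to change this download URL, so the only option is to retry; it usually downloads successfully after a few attempts.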
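Since the workaround is simply to retry, the retry loop can be automated. The following is a rough Python sketch, not part of the original build; the helper name `retry` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 5, delay_s: float = 5.0) -> T:
    """Call `action` until it succeeds or `attempts` tries are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # e.g. a failed download or build step
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            if attempt < attempts:
                time.sleep(delay_s)  # short pause before trying again
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

One possible use is wrapping the build command, for example `retry(lambda: subprocess.run(["docker", "build", "-f", "Dockerfile.devel", "."], check=True))` after importing `subprocess`. Docker reuses the layers that already built successfully, so each retry resumes from the failing step.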

3. bazel build reports: FileType function is not available
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/private/defs.bzl:18:17: FileType function is not available. You may use a list of strings instead. You can temporarily reenable the function by passing the flag --incompatible_disallow_filetype=false
ERROR: /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/private/defs.bzl:19:18: FileType function is not available. You may use a list of strings instead. You can temporarily reenable the function by passing the flag --incompatible_disallow_filetype=false
ERROR: /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/private/defs.bzl:20:16: FileType function is not available. You may use a list of strings instead. You can temporarily reenable the function by passing the flag --incompatible_disallow_filetype=false
ERROR: /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/private/defs.bzl:22:21: FileType function is not available. You may use a list of strings instead. You can temporarily reenable the function by passing the flag --incompatible_disallow_filetype=false
ERROR: /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/private/defs.bzl:23:17: FileType function is not available. You may use a list of strings instead. You can temporarily reenable the function by passing the flag --incompatible_disallow_filetype=false
ERROR: error loading package '': in /tensorflow-serving/tensorflow_serving/workspace.bzl: in /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/org_tensorflow/tensorflow/workspace.bzl: in /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/defs.bzl: in /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/stylesheets/closure_css_binary.bzl: Extension file 'closure/private/defs.bzl' has errors
ERROR: error loading package '': in /tensorflow-serving/tensorflow_serving/workspace.bzl: in /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/org_tensorflow/tensorflow/workspace.bzl: in /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/defs.bzl: in /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/io_bazel_rules_closure/closure/stylesheets/closure_css_binary.bzl: Extension file 'closure/private/defs.bzl' has errors

As the message suggests, add --incompatible_disallow_filetype=false to the bazel build command.

A similar error also appears: Using cfg = "data" on an attribute is a noop and no longer supported. Please remove it. You can use --incompatible_disallow_data_transition=false to temporarily disable this check.

In the same way, add --incompatible_disallow_data_transition=false.

4. The build aborts with: C++ compilation of rule '@aws//:aws' failed (Exit 4): gcc failed: error executing command
/root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/external/aws/BUILD.bazel:12:1: C++ compilation of rule '@aws//:aws' failed (Exit 4): gcc failed: error executing command
(cd /root/.cache/bazel/_bazel_root/e53bbb0b0da4e26d24b415310219b953/execroot/tf_serving &&
exec env -
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/proc/self/cwd
PYTHON_BIN_PATH=/usr/bin/python
/usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++0x' -MD -MF bazel-out/k8-opt/bin/external/aws/_objs/aws/S3Client.d '-frandom-seed=bazel-out/k8-opt/bin/external/aws/_objs/aws/S3Client.o' -DCURL_STATICLIB -DPLATFORM_LINUX -DENABLE_CURL_CLIENT -DENABLE_NO_ENCRYPTION -iquote external/aws -iquote bazel-out/k8-opt/genfiles/external/aws -iquote bazel-out/k8-opt/bin/external/aws -iquote external/curl -iquote bazel-out/k8-opt/genfiles/external/curl -iquote bazel-out/k8-opt/bin/external/curl -iquote external/zlib_archive -iquote bazel-out/k8-opt/genfiles/external/zlib_archive -iquote bazel-out/k8-opt/bin/external/zlib_archive -iquote external/boringssl -iquote bazel-out/k8-opt/genfiles/external/boringssl -iquote bazel-out/k8-opt/bin/external/boringssl -isystem external/aws/aws-cpp-sdk-core/include -isystem bazel-out/k8-opt/genfiles/external/aws/aws-cpp-sdk-core/include -isystem bazel-out/k8-opt/bin/external/aws/aws-cpp-sdk-core/include -isystem external/aws/aws-cpp-sdk-kinesis/include -isystem bazel-out/k8-opt/genfiles/external/aws/aws-cpp-sdk-kinesis/include -isystem bazel-out/k8-opt/bin/external/aws/aws-cpp-sdk-kinesis/include -isystem external/aws/aws-cpp-sdk-s3/include -isystem bazel-out/k8-opt/genfiles/external/aws/aws-cpp-sdk-s3/include -isystem bazel-out/k8-opt/bin/external/aws/aws-cpp-sdk-s3/include -isystem external/curl/include -isystem bazel-out/k8-opt/genfiles/external/curl/include -isystem bazel-out/k8-opt/bin/external/curl/include -isystem external/zlib_archive -isystem bazel-out/k8-opt/genfiles/external/zlib_archive -isystem bazel-out/k8-opt/bin/external/zlib_archive -isystem external/boringssl/src/include -isystem bazel-out/k8-opt/genfiles/external/boringssl/src/include -isystem bazel-out/k8-opt/bin/external/boringssl/src/include '-D_GLIBCXX_USE_CXX11_ABI=0' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/aws/aws-cpp-sdk-s3/source/S3Client.cpp -o bazel-out/k8-opt/bin/external/aws/_objs/aws/S3Client.o)
Execution platform: @bazel_tools//platforms:host_platform
gcc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See for instructions.

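Exit status 4 together with "internal compiler error: Killed (program cc1plus)" means the compiler ran out of memory.
To work around this, limit the resources Bazel is allowed to use by adding the following to the bazel build command (the three values are available RAM in MB, CPU cores, and I/O capacity):
--local_resources=2048,.5,1.0 \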
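The extra flags from pitfalls 3 and 4 can be collected in one place so they are not forgotten when the bazel build line is edited. This is only a sketch: the helper name `extra_bazel_flags` is hypothetical, and the resource values are the ones used in this post rather than a recommendation.

```python
def extra_bazel_flags(ram_mb: int, cpu_cores: float, io: float = 1.0) -> list:
    """Return the extra `bazel build` flags discussed in pitfalls 3 and 4."""
    if ram_mb <= 0 or cpu_cores <= 0 or io <= 0:
        raise ValueError("resource values must be positive")
    return [
        # Re-enable the removed FileType function (pitfall 3).
        "--incompatible_disallow_filetype=false",
        # Tolerate cfg = "data" attributes (pitfall 3).
        "--incompatible_disallow_data_transition=false",
        # Cap RAM (MB), CPU cores and I/O so gcc is not killed (pitfall 4).
        f"--local_resources={ram_mb},{cpu_cores},{io}",
    ]


print(" ".join(extra_bazel_flags(2048, 0.5)))
```

The printed string can be pasted after `bazel build` in the Dockerfile. Lowering the RAM and CPU values further trades build speed for a smaller memory footprint.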

5. pip times out downloading tensorflow-1.13.1-cp27-cp27mu-manylinux1_x86_64.whl
Downloading from https://files.pythonhosted.org/packages/d2/ea/ab2c8c0e81bd051cc1180b104c75a865ab0fc66c89be992c4b20bbf6d624/tensorflow-1.13.1-cp27-cp27mu-manylinux1_x86_64.whl is slow, so switch to a mirror in China.
Add to the Dockerfile:
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

Alternatively, create a pip.conf file with the following content:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
Then add to the Dockerfile:
COPY pip.conf /root/.pip/pip.conf
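The trusted-host entry only has an effect when it names the same host as index-url. As a small sketch of how to keep the two in sync, the hypothetical helper below derives the host from the index URL and renders the pip.conf text with Python's standard configparser.

```python
import configparser
import io
from urllib.parse import urlparse


def render_pip_conf(index_url: str) -> str:
    """Render a pip.conf whose trusted-host matches the index URL's host."""
    host = urlparse(index_url).hostname
    if host is None:
        raise ValueError(f"no host found in index URL: {index_url!r}")
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    config["install"] = {"trusted-host": host}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_pip_conf("https://pypi.tuna.tsinghua.edu.cn/simple"))
```

Writing the returned text to pip.conf next to the Dockerfile produces the file that the COPY line expects.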

Step 3: Build the image without development tools

docker build -f Dockerfile.mkl --build-arg TF_SERVING_VERSION_GIT_BRANCH="1.13.0" -t tensorflow/serving:mkl .

Step 4: Export and import the image

docker save -o tensorflow.serving.mkl.tar tensorflow/serving:mkl
docker load -i tensorflow.serving.mkl.tar

Step 5: Serving half_plus_two

docker run -p 8501:8501 --name tfserving_half_plus_two -v C:/Windows/System32/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu:/models/half_plus_two -e MODEL_NAME=half_plus_two tensorflow/serving:mkl
curl was not available on my Windows system, so I used docker exec to enter the container and ran curl there:

docker exec -it containerid bash
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
Response:
{
"predictions": [2.5, 3.0, 4.5]
}
}
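The same request can also be sent without curl. The sketch below uses only the Python standard library and assumes the container is reachable at localhost:8501 as in the docker run command; the function names are hypothetical. The half_plus_two model computes x / 2 + 2, which is what the small reference function reproduces.

```python
import json
import urllib.request


def build_predict_request(host: str, model: str, instances: list) -> urllib.request.Request:
    """Build the POST request for TensorFlow Serving's REST predict endpoint."""
    url = f"http://{host}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def predict(host: str, model: str, instances: list) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_predict_request(host, model, instances)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def half_plus_two(x: float) -> float:
    """Reference computation for checking the server's predictions."""
    return x / 2 + 2


print([half_plus_two(x) for x in [1.0, 2.0, 5.0]])
```

With the container running, `predict("localhost:8501", "half_plus_two", [1.0, 2.0, 5.0])` should return the same predictions as the curl call.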

Adding -e MKLDNN_VERBOSE=1 to docker run prints a trace of the MKL operations as they execute.

Step 6: Serving ResNet-50 v1 Model

Download http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v1_fp32_savedmodel_NCHW_jpg.tar.gz

Then extract it on a Linux server:
tar --strip-components=2 -xvzf resnet_v1_fp32_savedmodel_NCHW_jpg.tar.gz
Copy the extracted 1538687457 folder to C:\tmp\resnet, and also copy the example scripts from C:\Windows\System32\serving\tensorflow_serving\example, which are needed later, into C:\tmp\resnet.

docker run -p 8501:8501 --name=tfserving_resnet_restapi -v "C:/tmp/resnet:/models/resnet" -e MODEL_NAME=resnet tensorflow/serving:mkl

Then enter the container with docker exec -it containerid bash.
Change to the /models/resnet/example directory and run:
apt-get install -y python python-requests
python resnet_client.py
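For readers curious what such a client sends, TensorFlow Serving's REST API expects binary inputs as base64 strings wrapped in a {"b64": ...} object. The sketch below builds a payload of that shape; it is an illustration under the assumption that the model takes raw JPEG bytes (as the _jpg suffix of the archive suggests), not a copy of resnet_client.py, and the function name and placeholder bytes are hypothetical.

```python
import base64
import json


def build_resnet_payload(jpeg_bytes: bytes) -> str:
    """Wrap raw JPEG bytes in the JSON body used by the REST predict API."""
    encoded = base64.b64encode(jpeg_bytes).decode("utf-8")
    # Binary values are sent as {"b64": "<base64 text>"} in the REST API.
    return json.dumps({"instances": [{"b64": encoded}]})


# Placeholder bytes, not a real image; read a .jpg file in practice.
fake_jpeg = b"\xff\xd8\xff\xe0 placeholder, not a real image"
payload = build_resnet_payload(fake_jpeg)
print(len(payload), "characters in request body")
```

The resulting string would be POSTed to http://localhost:8501/v1/models/resnet:predict, matching the MODEL_NAME used in the docker run command.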
