Check the CUDA version:
cat /usr/local/cuda/version.txt
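On some installs version.txt is missing; newer CUDA toolkits, as far as I know, ship a version.json instead. A minimal sketch that tries both and then falls back to nvcc. The function name cuda_version and its optional install-root argument are my own, not part of CUDA:

```shell
# Print the CUDA toolkit version found under an install root.
# Assumption: older toolkits ship version.txt, newer ones version.json.
cuda_version() {
    root=${1:-/usr/local/cuda}
    if [ -f "$root/version.txt" ]; then
        cat "$root/version.txt"
    elif [ -f "$root/version.json" ]; then
        cat "$root/version.json"
    elif command -v nvcc >/dev/null 2>&1; then
        # Last resort: ask the compiler, if it is on PATH.
        nvcc --version
    else
        echo "CUDA version not found under $root" >&2
        return 1
    fi
}
```

Call it as cuda_version for the default location, or cuda_version /usr/local/cuda-10.0 for a specific install.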
Check where a file links to:
ls -al libcuda.so.1
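ls -al only shows one hop of a link. A small sketch that prints every hop until it reaches something that is not a symlink. show_link_chain is a hypothetical helper name; it relies only on readlink and dirname, and gives up after 40 hops in case the links form a loop:

```shell
# Print each hop of a symlink chain, one "link -> target" line per hop.
show_link_chain() {
    path=$1
    hops=0
    while [ -L "$path" ] && [ "$hops" -lt 40 ]; do
        target=$(readlink "$path")
        printf '%s -> %s\n' "$path" "$target"
        case $target in
            /*) path=$target ;;                      # absolute target
            *)  path=$(dirname "$path")/$target ;;   # relative to the link's directory
        esac
        hops=$((hops + 1))
    done
}
```

For example: show_link_chain /usr/local/cuda/lib64/libcuda.so.1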
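Deploying a t2t model with tensorflow_model_server under Docker
1. Install Docker
See the Runoob tutorial: http://www.runoob.com/docker/docker-tutorial.html
2. Pull the serving image:
docker pull tensorflow/serving:latest-devel (the image is large, over 3 GB, so the download takes a while)
3. Create a container from the serving image: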
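docker run -it --privileged=true -p 9000:9000 tensorflow/serving:latest-devel (--privileged=true is there for GPU access; it must come before the image name, otherwise Docker passes it to the container as a command)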
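4. Copy the model into the container (open a new terminal window):
docker cp [directory containing the model files] [container ID]:/[directory in the container]
For example: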
docker cp E:/model/export 0f087sdf8sf:/model
5. Run the tensorflow_model_server service inside the container
tensorflow_model_server --port=9000 --model_name=nmt --model_base_path=/model
6. Connect with t2t
t2t-query-server --server=*.*.*.*:9000 --servable_name=nmt --problem=nmt_zhen --data_dir=/home/data --t2t_usr_dir=/home/script
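(The address in front of :9000 is the server's address, or the default address Docker was started with.)
GPU version issues:
1) tensorflow_model_server: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory
Fix: make libcuda.so.1 available in a directory the loader searches. Three approaches: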
1)
cp /usr/local/cuda-10.0/compat/libcud* /usr/local/cuda/lib64/
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
2)
Update the apt package index: apt-get update
Install vim: apt-get install vim
vi /root/.bashrc
Add:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda-10.0/compat"
/usr/local/cuda-10.0/lib64/stubs
export CUDA_HOME=/usr/local/cuda
Run:
source /root/.bashrc
3)
Install nvidia-docker; this resolves the problem.
/usr/local/cuda-10.0/lib64/stubs
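Before and after trying the fixes above, it helps to see whether libcuda.so.1 actually sits in one of the LD_LIBRARY_PATH directories. A rough sketch, with find_lib as a made-up helper name. It only looks at LD_LIBRARY_PATH; the real loader also consults the ldconfig cache and the default library directories, so a miss here is a hint rather than proof:

```shell
# Print the first LD_LIBRARY_PATH entry that contains the given library name.
# Returns 1 if no entry contains it.
find_lib() {
    lib=$1
    old_ifs=$IFS
    IFS=:
    for dir in ${LD_LIBRARY_PATH:-}; do
        IFS=$old_ifs
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            echo "$dir/$lib"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For example: find_lib libcuda.so.1 || echo "libcuda.so.1 is not on LD_LIBRARY_PATH"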
Load a single model:
docker run -p 8501:8501 \
  --mount type=bind,source=/path/to/my_model/,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
docker run -p 9098:8500 --mount type=bind,source=/opt/data/D_NMT/translate_enzh/export/V1.0/,target=/models/nmt_enzh -e MODEL_NAME=nmt_enzh -t tensorflow/serving
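To confirm a model loaded, the REST endpoint can be queried. In the tensorflow/serving image, 8501 is the REST port and 8500 the gRPC port, so this applies to the first command (which publishes 8501) and not to the 9098:8500 one. A minimal sketch; model_status_url is a hypothetical helper that only builds the model-status URL, and localhost is an assumption about where the container runs:

```shell
# Build the TensorFlow Serving REST URL that reports a model's status.
model_status_url() {
    host=$1
    port=$2
    model=$3
    printf 'http://%s:%s/v1/models/%s\n' "$host" "$port" "$model"
}

# Usage (needs the container to be running and curl to be installed):
#   curl -s "$(model_status_url localhost 8501 my_model)"
```

If the model is being served, the response is a small JSON document listing the loaded version(s) and their state.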
Load multiple models:
docker run --runtime=nvidia -p 8500:8500 -p 8501:8501 \
  --mount type=bind,source=/path/to/my_model/,target=/models/my_model \
  --mount type=bind,source=/path/to/my/models.config,target=/models/models.config \
  -t tensorflow/serving:latest-gpu --model_config_file=/models/models.config &
Problems installing nvidia-docker:
Installation guide: https://github.com/NVIDIA/nvidia-docker#quick-start
Error: version `XZ_5.1.2alpha' not found (required by /lib64/librpmio.so.3)
Fix: download liblzma.so.5.2.2 into the /opt/anaconda3/envs/py36/lib directory
Create the symlink (run from inside that directory): sudo ln -s -f liblzma.so.5.2.2 liblzma.so.5
Problem solved!
Load multiple models:
sudo docker run -d -p 8500:8500 --mount type=bind,source=/path/to/source_models/model1/,target=/models/model1 --mount type=bind,source=/path/to/source_models/model2/,target=/models/model2 --mount type=bind,source=/path/to/source_models/model3/,target=/models/model3 --mount type=bind,source=/path/to/source_models/model.config,target=/models/model.config -t --name ner tensorflow/serving --model_config_file=/models/model.config
docker run --runtime=nvidia -p 9000:8500 --mount type=bind,source=/opt/data/models/nmt_enzh,target=/models/nmt_enzh --mount type=bind,source=/opt/data/models/nmt_zhen/,target=/models/nmt_zhen --mount type=bind,source=/opt/data/models/model.config,target=/models/model.config -t tensorflow/serving:latest-gpu --model_config_file=/models/model.config
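These notes never show what model.config contains. Below is a sketch that writes one for the two NMT models mounted in the command above, using the text-format model_config_list layout that TensorFlow Serving reads via --model_config_file. The helper name write_model_config is made up, and the base_path values assume the same /models/... mount targets as that command:

```shell
# Write a two-model TensorFlow Serving config file to the given path.
# Assumption: model names and base paths mirror the --mount targets above.
write_model_config() {
    out=$1
    cat > "$out" <<'EOF'
model_config_list {
  config {
    name: "nmt_enzh"
    base_path: "/models/nmt_enzh"
    model_platform: "tensorflow"
  }
  config {
    name: "nmt_zhen"
    base_path: "/models/nmt_zhen"
    model_platform: "tensorflow"
  }
}
EOF
}
```

For example: write_model_config /opt/data/models/model.config, then start the container. Each base_path directory is expected to hold numbered version subdirectories (such as 1/) containing the exported model.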
[root@bogon /]# cd /usr/local/cuda-10.0/lib64
[root@bogon lib64]# ls -al libcuda.so.1
lrwxrwxrwx 1 root root 43 Mar 15 14:24 libcuda.so.1 -> /usr/local/cuda-10.0/lib64/stubs/libcuda.so
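In the listing above libcuda.so.1 resolves into the stubs directory. As far as I understand, the stub libcuda.so exists so that programs can be linked on machines without a driver, and it is not a working driver library at run time, which may be why installing nvidia-docker (which exposes the host's real driver library) was what finally worked. A small sketch to detect this situation; is_stub_libcuda is a hypothetical helper and assumes a readlink that supports -f (GNU coreutils):

```shell
# Return 0 if the path resolves to a file inside a "stubs" directory,
# 1 if it resolves elsewhere, 2 if it cannot be resolved at all.
is_stub_libcuda() {
    resolved=$(readlink -f "$1") || return 2
    case $resolved in
        */stubs/*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

For example: is_stub_libcuda /usr/local/cuda-10.0/lib64/libcuda.so.1 && echo "libcuda.so.1 points at the stub"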