Prerequisite: docker and nvidia-docker are already installed.
1. Create the folder that will hold the Dockerfile
mkdir tfgpu
cd tfgpu
vim Dockerfile
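2. Dockerfile contents
Note: writing the mirror with RUN echo -e "[global]\nindex-url = https://pypi.mirrors.ustc.edu.cn/simple/" >> ~/pip.conf did not seem to take effect, most likely because pip does not read ~/pip.conf (its legacy per-user file is ~/.pip/pip.conf). The step can be left out; passing the mirror with -i on each pip command is enough.
tensorflow-gpu can be pinned to 1.12.0 directly. Without a version, pip downloads 2.1.0, which has to be uninstalled again later.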
FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04
MAINTAINER bobwang
# install basic dependencies
RUN apt-get update
RUN apt-get install -y wget \
vim \
cmake
# install Anaconda3
RUN wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-5.2.0-Linux-x86_64.sh -O ~/anaconda3.sh
RUN bash ~/anaconda3.sh -b -p /home/anaconda3 \
&& rm ~/anaconda3.sh
ENV PATH /home/anaconda3/bin:$PATH
# change mirror
RUN mkdir ~/.pip \
&& cd ~/.pip
RUN echo -e "[global]\nindex-url = https://pypi.mirrors.ustc.edu.cn/simple/" >> ~/pip.conf
# install tensorflow
RUN /home/anaconda3/bin/pip install wrapt --ignore-installed && pip install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple/ && /home/anaconda3/bin/pip install tensorflow-gpu -i https://mirrors.aliyun.com/pypi/simple/
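For reference, the "change mirror" step is meant to produce a per-user pip configuration file. Below is a minimal Python sketch of that intent, assuming pip's legacy per-user location ~/.pip/pip.conf on Linux. The helper name write_pip_conf and its home argument are hypothetical and exist only for illustration; they are not part of the original setup.

```python
import configparser
import os
import tempfile


def write_pip_conf(home, index_url):
    """Write <home>/.pip/pip.conf with a [global] index-url entry and return its path."""
    conf_dir = os.path.join(home, ".pip")
    os.makedirs(conf_dir, exist_ok=True)
    conf_path = os.path.join(conf_dir, "pip.conf")

    parser = configparser.ConfigParser()
    parser["global"] = {"index-url": index_url}
    with open(conf_path, "w") as fh:
        parser.write(fh)
    return conf_path


if __name__ == "__main__":
    # Use a throwaway directory instead of the real home directory.
    with tempfile.TemporaryDirectory() as fake_home:
        path = write_pip_conf(fake_home, "https://pypi.mirrors.ustc.edu.cn/simple/")
        with open(path) as fh:
            print(fh.read())
```

The file has to end up inside the .pip directory. Creating the directory and then writing the file next to it, as a separate step, leaves pip with nothing to read.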
3. Build the image
docker build -t tf-gpu .
4. Run the image
sudo nvidia-docker run -it --rm --name test c225e83ca98e /bin/bash
5. Use TensorFlow
import tensorflow as tf
This reports an error:
Key message:
W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
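Solution:
Possible cause 1: the TensorRT package (which provides libnvinfer) is missing.
Possible cause 2: the TensorFlow version is incompatible with the installed CUDA version.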
Version compatibility table (TensorFlow vs. CUDA/cuDNN):
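The compatibility lookup can be sketched in a few lines of Python. The entries below are a small excerpt of TensorFlow's published tested build configurations, not the full table. The names TESTED_CONFIGS and is_compatible are hypothetical helpers introduced for illustration.

```python
# Tested CUDA / cuDNN versions for a few tensorflow-gpu releases, taken from
# TensorFlow's published "tested build configurations"; only a small excerpt.
TESTED_CONFIGS = {
    "1.12.0": {"cuda": "9", "cudnn": "7"},
    "2.0.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.1.0": {"cuda": "10.1", "cudnn": "7.6"},
}


def is_compatible(tf_version, cuda_version):
    """Return True if cuda_version starts with the CUDA release tested for tf_version."""
    config = TESTED_CONFIGS.get(tf_version)
    if config is None:
        raise KeyError("no entry for tensorflow-gpu %s in this excerpt" % tf_version)
    wanted = config["cuda"].split(".")
    actual = cuda_version.split(".")
    # Compare only as many components as the table specifies (e.g. "9" matches "9.0.176").
    return actual[: len(wanted)] == wanted


if __name__ == "__main__":
    print(is_compatible("1.12.0", "9.0.176"))
    print(is_compatible("2.1.0", "9.0.176"))
```

With a CUDA 9.0 base image, the check passes for 1.12.0 and fails for 2.1.0, which matches the symptom seen here.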
Check the CUDA version:
cat /usr/local/cuda/version.txt
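To use the CUDA version in a script rather than read it by eye, the file contents can be parsed. This is a minimal sketch that assumes version.txt holds a line of the form "CUDA Version 9.0.176". The sample string is illustrative, and parse_cuda_version is a hypothetical helper name.

```python
import re


def parse_cuda_version(text):
    """Extract (major, minor, patch) from text such as 'CUDA Version 9.0.176'."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no CUDA version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    # Inside the container one would pass open("/usr/local/cuda/version.txt").read().
    print(parse_cuda_version("CUDA Version 9.0.176"))
```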
Check the cuDNN version:
cat /usr/local/cuda/include/cudnn.h |grep CUDNN_MAJOR -A 2
Nothing was printed.
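The grep above looks for the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros in the header. The same extraction can be sketched in Python, returning None when the macros are absent, as in the empty result above. The header excerpt and its version numbers are made up for the example, and parse_cudnn_version is a hypothetical helper name.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the CUDNN_* macros, or None if any is missing."""
    found = dict(
        re.findall(
            r"^\s*#define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)",
            header_text,
            re.MULTILINE,
        )
    )
    if not {"MAJOR", "MINOR", "PATCHLEVEL"} <= set(found):
        return None
    return int(found["MAJOR"]), int(found["MINOR"]), int(found["PATCHLEVEL"])


if __name__ == "__main__":
    # Illustrative header excerpt; the version numbers here are made up for the example.
    sample = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 0\n#define CUDNN_PATCHLEVEL 5\n"
    print(parse_cudnn_version(sample))
    print(parse_cudnn_version(""))
```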
Since the base image ships CUDA 9.0, replace the default TensorFlow with 1.12.0:
pip uninstall tensorflow-gpu
pip install tensorflow-gpu==1.12.0 -i https://mirrors.aliyun.com/pypi/simple/
Verify that the installation succeeded:
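import tensorflow as tf
# a is 2x3 and b is 3x2, so matmul(a, b) is a valid 2x2 product
a = tf.constant([1.0,2.0,3.0,4.0,5.0,6.0],shape=[2,3],name='a')
b = tf.constant([1.0,2.0,3.0,4.0,5.0,6.0],shape=[3,2],name='b')
c = tf.matmul(a, b)
# log_device_placement prints the device each op runs on (TF 1.x API)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))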
Output:
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2020-04-20 08:48:12.497338: I tensorflow/core/common_runtime/placer.cc:927] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
Const: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2020-04-20 08:48:12.497482: I tensorflow/core/common_runtime/placer.cc:927] Const: (Const)/job:localhost/replica:0/task:0/device:CPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-04-20 08:48:12.497788: I tensorflow/core/common_runtime/placer.cc:927] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-04-20 08:48:12.497971: I tensorflow/core/common_runtime/placer.cc:927] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
[49. 64.]]
# Success
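The sign of success in this output is that MatMul lands on device:GPU:0. A minimal sketch of checking this automatically follows, assuming the placement lines have the "name: (Op): /job:..." form shown above. The helper names placements and uses_gpu are hypothetical.

```python
import re

# Matches lines such as "MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0".
PLACEMENT = re.compile(r"^(?P<name>\w+): \((?P<op>\w+)\): (?P<device>/job:\S+)$")


def placements(log_lines):
    """Map each node name to the device named in a log_device_placement line."""
    result = {}
    for line in log_lines:
        match = PLACEMENT.match(line.strip())
        if match:
            result[match.group("name")] = match.group("device")
    return result


def uses_gpu(log_lines, node_name):
    """True if node_name was placed on a GPU device according to the log."""
    device = placements(log_lines).get(node_name, "")
    return "/device:GPU:" in device


if __name__ == "__main__":
    sample = [
        "MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0",
        "Const: (Const): /job:localhost/replica:0/task:0/device:CPU:0",
    ]
    print(uses_gpu(sample, "MatMul"), uses_gpu(sample, "Const"))
```

The timestamped placer.cc lines repeat the same information and are skipped by the pattern.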
1. Install the environment
pip install numpy==1.16.2 -i https://mirrors.aliyun.com/pypi/simple/
pip install django==2.2.5 -i https://mirrors.aliyun.com/pypi/simple/
pip install filetype
pip install pymysql
pip install django-cors-headers
pip install djangorestframework
pip install opencv-python==4.1.1.26
dpkg --add-architecture i386
apt-get update
apt-get upgrade
apt-get install libsm6
apt-get install libxrender1
apt-get install libxext-dev
pip install dlib
# apt-get libSM-1.2.2-2.e17.x86_64 -- setopt=protected_multilib=false
conda install keras=2.2.4
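Version pins in a long install list are easy to mistype, for example with a doubled dot in the version. Below is a minimal sketch of a syntax check for name==version pins. It is a simplification (real version syntax allows more forms than dot-separated numbers) and it does not catch misspelled package names. PIN and check_pins are hypothetical names.

```python
import re

# name==version where version is dot-separated numbers, e.g. "numpy==1.16.2".
PIN = re.compile(r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)==(?P<version>\d+(?:\.\d+)*)$")


def check_pins(requirements):
    """Split requirement strings into (valid, invalid) lists by the PIN pattern."""
    valid, invalid = [], []
    for req in requirements:
        (valid if PIN.match(req) else invalid).append(req)
    return valid, invalid


if __name__ == "__main__":
    good, bad = check_pins(["numpy==1.16.2", "django==2.2.5", "opencv-python==4.1.1.26"])
    print(good, bad)
```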
sudo docker commit -a bobwang -m "tf12gpu" 1bfd0ffe6988 tf-gpu1.12-emotion:1.0
sudo nvidia-docker run --rm -it -p8000:8000 -v /home/bob/tfgpu/emotion-classifier/:/root/projects --name emotion a6313cb19987 /bin/bash
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
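Cause: this is a Python encoding issue. With no UTF-8 locale set in the container, Python falls back to ASCII for stdout.
Solution:
# Edit the environment variables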
vim /etc/profile
# Add
export PYTHONIOENCODING=utf-8
export LANG='en_US.UTF-8'
# Make the environment variables take effect (e.g. source /etc/profile)
# Check from inside Python
import sys
sys.stdout.encoding
>>:'utf-8'
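The error can be reproduced without touching the real terminal. This minimal sketch writes text to an in-memory stream with a chosen encoding, which is roughly what print does with sys.stdout. The helper name can_write is hypothetical.

```python
import io


def can_write(text, encoding):
    """Return True if text can be written to a stream that uses the given encoding."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(can_write("中文输出", "ascii"))
    print(can_write("中文输出", "utf-8"))
```

With an ASCII stream the write fails in the same way as the traceback above. With UTF-8 it succeeds, which is the effect of setting PYTHONIOENCODING=utf-8.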
sudo docker commit -a bobwang -m "tf12gpu-utf8" 34085eec0108 tf-gpu1.12-emotion-utf8:2.0
sha256:8cd080380d7371461df4b65cbc832a7b7dc876110af8fa4bc1aedec2a9813636