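A version mismatch will prevent the TensorFlow image from running.
To set up the GPU version of TensorFlow you need an NVIDIA graphics card that supports CUDA. First install the underlying platform, CUDA, and its deep learning library, cuDNN; then install the TensorFlow GPU version that matches them.
Each TensorFlow GPU release from 1.2 to 2.1 is tested against a specific CUDA and cuDNN version; consult TensorFlow's tested build configurations for the exact pairing.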
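For quick reference, the pairings can be written down as a small lookup. This is only a sketch: the values are transcribed from TensorFlow's published tested build configurations for Linux GPU builds as I understand them, and the names TESTED_CONFIGS and required_cuda_cudnn are made up for this example. Check the official table before relying on it.

```python
# Tested CUDA / cuDNN pairings for TensorFlow GPU builds on Linux.
# Keys are (major, minor) TensorFlow versions; values are (CUDA, cuDNN).
# Assumption: transcribed by hand from TensorFlow's tested build table.
TESTED_CONFIGS = {
    (1, 2): ("8", "5.1"),
    (1, 3): ("8", "6"),
    (1, 4): ("8", "6"),
    **{(1, minor): ("9", "7") for minor in range(5, 13)},        # 1.5 to 1.12
    **{(1, minor): ("10.0", "7.4") for minor in range(13, 16)},  # 1.13 to 1.15
    (2, 0): ("10.0", "7.4"),
    (2, 1): ("10.1", "7.6"),
}


def required_cuda_cudnn(tf_version: str):
    """Return (cuda, cudnn) for a version string such as '1.12.0'."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    try:
        return TESTED_CONFIGS[(major, minor)]
    except KeyError:
        raise ValueError(f"no tested configuration recorded for TensorFlow {tf_version}")
```

For example, required_cuda_cudnn("1.12.0") looks up the entry for the 1.12 series, and a version outside 1.2 to 2.1 raises ValueError rather than guessing.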
Check that a container can reach the GPU through nvidia-docker:
sudo nvidia-docker run --rm -it nvidia/cuda:9.0-base nvidia-smi
zkf@zkf-ThinkPad-T490:~$ sudo nvidia-docker run --rm -it nvidia/cuda:9.0-base nvidia-smi
Tue Aug 25 09:16:43 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Graphics Device     Off  | 00000000:3C:00.0 Off |                  N/A |
| N/A   52C    P8    N/A /  N/A |    327MiB /  2001MiB |     21%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
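The driver version in this output matters because each CUDA release needs a minimum host driver, which is one way the version mismatch mentioned at the top shows up. The sketch below pulls the driver version out of nvidia-smi text and compares it with a minimum. The function names are hypothetical, and the minimum-driver numbers are my reading of NVIDIA's CUDA compatibility table for Linux, so verify them against the release notes.

```python
import re

# Minimum Linux driver per CUDA release (assumption: taken from NVIDIA's
# CUDA compatibility table; only releases relevant to this guide are listed).
MIN_DRIVER = {
    "9.0": (384, 81),
    "10.0": (410, 48),
    "10.1": (418, 39),
}


def driver_version(smi_output: str):
    """Extract the driver version from nvidia-smi text as a tuple of ints."""
    match = re.search(r"Driver Version:\s*([\d.]+)", smi_output)
    if match is None:
        raise ValueError("no 'Driver Version' field found")
    return tuple(int(part) for part in match.group(1).split("."))


def driver_supports(smi_output: str, cuda: str) -> bool:
    """True if the reported driver meets the minimum for the given CUDA release."""
    return driver_version(smi_output) >= MIN_DRIVER[cuda]
```

With the 384.130 driver reported above, this check passes for CUDA 9.0 and fails for 10.0 and 10.1, which is consistent with testing against the nvidia/cuda:9.0-base image.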
Check where Docker currently stores its data:
docker info
zkf@zkf-ThinkPad-T490:~$ docker info
Client:
WARNING: No swap limit support
Debug Mode: false
.
.
.
WARNING: No swap limit support
WARNING: No swap limit support
Docker Root Dir: /var/lib/docker # default storage path for pulled images
.
.
.
Registry Mirrors:
https://zio7xszp.mirror.aliyuncs.com/
Live Restore Enabled: false
Check how much space Docker's data directory currently uses:
sudo du -h --max-depth=1 /var/lib/docker
zkf@zkf-ThinkPad-T490:~$ sudo du -h --max-depth=1 /var/lib/docker
20K /var/lib/docker/plugins
219M /var/lib/docker/overlay2
20K /var/lib/docker/builder
72K /var/lib/docker/buildkit
4.0K /var/lib/docker/swarm
52K /var/lib/docker/network
4.0K /var/lib/docker/containers
1.1M /var/lib/docker/image
4.0K /var/lib/docker/tmp
4.0K /var/lib/docker/runtimes
4.0K /var/lib/docker/trust
28K /var/lib/docker/volumes
220M /var/lib/docker # space currently used by images
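Before pointing Docker at a new directory, it can be worth confirming that the target filesystem has enough free space for the images you plan to pull. A minimal sketch using only the standard library; has_room and its margin parameter are hypothetical names for this example.

```python
import shutil


def has_room(target_dir: str, needed_bytes: int, margin: float = 1.2) -> bool:
    """Return True if target_dir's filesystem has needed_bytes * margin free."""
    free = shutil.disk_usage(target_dir).free
    return free >= needed_bytes * margin
```

For instance, has_room("/home", 220 * 1024**2) asks whether /home could hold the 220M shown above with a 20% safety margin. Large images need far more, so pass a size that reflects what you intend to pull.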
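Edit the Docker daemon configuration:
sudo vim /etc/docker/daemon.json
Set data-root to the new Docker storage location by adding "data-root": "/home/dockerimages" (the path used in this setup).
After the change, the file looks like this: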
zkf@zkf-ThinkPad-T490:~$ cat /etc/docker/daemon.json
{
    "data-root": "/home/dockerimages",
    "registry-mirrors": ["https://zio7xszp.mirror.aliyuncs.com"],
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
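Hand-editing JSON makes it easy to drop a comma or overwrite the nvidia runtime entry by accident. As an alternative, here is a sketch that sets only the data-root key and leaves everything else in the file untouched. The helper name set_data_root is hypothetical, and writing to /etc/docker/daemon.json requires root privileges.

```python
import json
from pathlib import Path


def set_data_root(config_path: str, new_root: str) -> dict:
    """Set "data-root" in a Docker daemon.json, keeping every other key."""
    path = Path(config_path)
    # A missing or empty file is treated as an empty configuration.
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config["data-root"] = new_root
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

Running set_data_root("/etc/docker/daemon.json", "/home/dockerimages") as root would produce a file equivalent to the one shown above, provided the mirror and runtime entries were already present. If the file contains invalid JSON, json.loads raises an error and nothing is written.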
Restart Docker so the new setting takes effect, then confirm it:
sudo systemctl stop docker
sudo systemctl start docker
sudo systemctl status docker
docker info
zkf@zkf-ThinkPad-T490:~$ docker info
Client:
Debug Mode: false
.
.
.
Docker Root Dir: /home/dockerimages # changed successfully
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
https://zio7xszp.mirror.aliyuncs.com/
Live Restore Enabled: false
Here we use the deepo image (https://hub.docker.com/r/ufoym/deepo/), specifically the tag ufoym/deepo:all-py36-jupyter. It bundles most deep learning frameworks, runs on the GPU, and ships with Jupyter.
docker pull ufoym/deepo:all-py36-jupyter
nvidia-docker run -it -d -p 3888:8888 --ipc=host -v /data:/data --name deepo01 ufoym/deepo:all-py36-jupyter jupyter notebook --no-browser --ip=0.0.0.0 --allow-root --NotebookApp.token= --notebook-dir='/data'
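Parameters:
-v /data:/data: the left side is the host path and the right side is the path inside the container. For example, if my files are in /home/ubuntu/data and should be mounted at /data inside the container, use -v /home/ubuntu/data:/data
--notebook-dir: the default working directory for Jupyter; it is best set to the container-side data path above, that is, /data
-p 3888:8888: the left side is the host port and the right side is the container port. To serve Jupyter on host port 8080 instead, change the parameter to -p 8080:8888
--NotebookApp.token: the token (password) required to log in to Jupyter; here it is left empty, so no token is required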
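To see how these parameters fit together, the command can be assembled from its parts. This is a sketch only: deepo_run_command is a hypothetical helper, the defaults mirror the command used in this post, and shlex.join needs Python 3.8 or newer.

```python
import shlex


def deepo_run_command(host_dir, host_port, name="deepo01",
                      container_dir="/data",
                      image="ufoym/deepo:all-py36-jupyter"):
    """Build the nvidia-docker argument list used in this guide."""
    return [
        "nvidia-docker", "run", "-it", "-d",
        "-p", f"{host_port}:8888",            # host port : container port
        "--ipc=host",
        "-v", f"{host_dir}:{container_dir}",  # host path : container path
        "--name", name,
        image,
        "jupyter", "notebook", "--no-browser", "--ip=0.0.0.0", "--allow-root",
        "--NotebookApp.token=",               # empty token: no login token
        f"--notebook-dir={container_dir}",    # Jupyter starts in the mounted dir
    ]


# Example from the parameter notes: data in /home/ubuntu/data, Jupyter on 8080.
cmd = deepo_run_command("/home/ubuntu/data", 8080)
print(shlex.join(cmd))
```

The list can be printed for copy and paste, as here, or handed to subprocess.run(cmd, check=True) to start the container directly. Keeping container_dir as a single argument ensures the mount target and the notebook directory stay in sync.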
The container starts successfully, and nvidia-smi inside it sees the GPU:
root@959f94717541:/# nvidia-smi
Tue Aug 25 10:28:04 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Graphics Device     Off  | 00000000:3C:00.0 Off |                  N/A |
| N/A   47C    P8    N/A /  N/A |    293MiB /  2001MiB |     24%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
Jupyter is now reachable in the browser at:
http://localhost:3888/tree?
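If the page does not load, a quick first check is whether anything is listening on the mapped host port at all. A minimal sketch with the standard library; port_is_open is a hypothetical helper, and it only tests that a TCP connection succeeds, not that Jupyter itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling port_is_open("localhost", 3888) on the host returns False when the container is not running or the -p mapping points at a different port, which narrows down where to look next.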