Pitfalls when installing Docker
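My recommendation, if you have the option: start with Ubuntu 18 from the beginning, so that the Python and gcc versions are reasonably new, then get PyCharm Professional and set up the runtime environment with Docker.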
I followed the guide at https://docs.docker.com/install/linux/docker-ce/ubuntu/#install-using-the-repository up to this step: "Install the latest version of Docker Engine - Community and containerd, or go to the next step to install a specific version: $ sudo apt-get install docker-ce docker-ce-cli containerd.io"
At that step I found that a newer libseccomp2 was required (libseccomp2_2.4.1-0ubuntu0.16.04.2_amd64.deb). I downloaded it from https://ubuntu.pkgs.org/16.04/ubuntu-updates-main-amd64/libseccomp2_2.4.1-0ubuntu0.16.04.2_amd64.deb.html and installed it directly with sudo apt install ./libseccomp2_2.4.1-0ubuntu0.16.04.2_amd64.deb, which upgraded libseccomp2. After that, Docker installed successfully.
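A quick check before the apt-get install step can show whether the installed libseccomp2 is too old. This is only a minimal sketch, assuming a Debian/Ubuntu system with dpkg available. The threshold 2.4.1 is simply the version I ended up installing, not a documented minimum, and needs_upgrade is a hypothetical helper name.

```shell
# Assumed threshold for illustration; use the version named in apt's error message.
required="2.4.1"

# Succeeds when the installed version ($1) is older than the required one ($2).
# dpkg --compare-versions understands Debian revision suffixes such as -0ubuntu0.16.04.2.
needs_upgrade() {
    dpkg --compare-versions "$1" lt "$2"
}

# Query the installed version; the result is empty when the package is not installed.
installed=$(dpkg-query -W -f='${Version}' libseccomp2 2>/dev/null || true)

if [ -z "$installed" ]; then
    echo "libseccomp2 is not installed"
elif needs_upgrade "$installed" "$required"; then
    echo "libseccomp2 $installed is older than $required; upgrade it first"
else
    echo "libseccomp2 $installed is new enough"
fi
```

If the check reports an older version, installing the downloaded .deb with sudo apt install, as described above, is one way to upgrade it.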
(base) gpu604@gpu604:~$ sudo docker version
Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b7f0
 Built:             Wed Mar 11 01:25:58 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b7f0
  Built:            Wed Mar 11 01:24:30 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
(base) gpu604@gpu604:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 410.48 Thu Sep 6 06:36:33 CDT 2018
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11)
Following https://github.com/NVIDIA/nvidia-docker, I ran the commands below:
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
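The distribution= line above sources /etc/os-release and concatenates ID and VERSION_ID, and the result picks which repository list gets downloaded. The sketch below prints that value and the resulting URL so they can be checked before anything is piped into sudo tee. The function names distribution_from and nvidia_list_url are my own, hypothetical helpers; the URL pattern is the one from the commands above.

```shell
# Build the distribution string the same way the install commands do,
# but from any os-release style file, so it can be inspected first.
distribution_from() {
    # Source the file in a subshell so ID and VERSION_ID do not leak into the caller.
    ( . "$1" && printf '%s%s\n' "${ID:-}" "${VERSION_ID:-}" )
}

# Print the repository list URL that curl would fetch for that distribution.
nvidia_list_url() {
    printf 'https://nvidia.github.io/nvidia-docker/%s/nvidia-docker.list\n' "$1"
}

if [ -r /etc/os-release ]; then
    distribution=$(distribution_from /etc/os-release)
    echo "distribution: $distribution"
    nvidia_list_url "$distribution"
fi
```

On my Ubuntu 16.04 machine I would expect the string to be ubuntu16.04; if the URL for your value does not exist, the curl step will write an error page into the sources list instead of a repository definition.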
Once the installation finished, I tested it with the command from the Usage section:
#### Test nvidia-smi with the latest official CUDA image
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
After that, the image seemed to show up in the output of sudo docker images.
Then:
docker pull tensorflow/tensorflow:latest-gpu-jupyter # latest release w/ GPU support and Jupyter
At this point sudo docker images also listed tensorflow.
Later I deleted it again with sudo docker image rm tensorflow/tensorflow:latest-gpu-jupyter
sudo docker ps -a # Lists containers (and tells you which images they are spun from)
sudo docker images # Lists images
sudo docker rm CONTAINER_ID # Removes a stopped container
sudo docker rmi IMAGE_ID # Removes an image
# Will fail if there is a running instance of that image i.e. container
sudo docker rmi -f IMAGE_ID # Forces removal of an image even if it is referenced in multiple repositories,
# i.e. same image id given multiple names/tags
# Will still fail if there is a docker container referencing image
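Since rmi fails while any container still references the image, the usual order is to remove the containers first and the image second. The following is a rough sketch of that order, not an official recipe. remove_image_and_containers is a hypothetical helper name, and DOCKER is an override variable I introduce here for illustration; it defaults to "sudo docker" as used in these notes.

```shell
# Remove every container created from an image, then the image itself.
remove_image_and_containers() {
    image="$1"
    # DOCKER is a hypothetical override hook; unquoted on purpose so "sudo docker" splits into two words.
    docker_cmd="${DOCKER:-sudo docker}"

    # List the IDs of all containers (running or stopped) whose ancestor is the image.
    ids=$($docker_cmd ps -a -q --filter "ancestor=$image")

    if [ -n "$ids" ]; then
        # -f also removes containers that are still running.
        $docker_cmd rm -f $ids
    fi

    # With no containers left, removing the image should no longer be refused.
    $docker_cmd rmi "$image"
}

# Example call (commented out so that pasting this block deletes nothing):
# remove_image_and_containers tensorflow/tensorflow:latest-gpu-jupyter
```

Note that rm -f throws away anything stored only inside those containers, so check sudo docker ps -a first.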
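In the end I confirmed that the TensorFlow GPU version you pick has to match your own CUDA version and driver version. For example, the 410.48 driver shown above goes with CUDA 10.0, which is what TensorFlow 2.0.0 is built against.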
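The driver side of that check can be scripted. This is a sketch under assumptions: it parses text in the format of /proc/driver/nvidia/version as shown earlier, relies on GNU sort -V for version ordering, and uses 410.48 as the minimum Linux driver for CUDA 10.0, which is the figure I know from NVIDIA's release notes and worth re-checking for other CUDA versions. driver_version and version_ge are hypothetical helper names.

```shell
# Extract the driver version from text in the format of /proc/driver/nvidia/version (read from stdin).
driver_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# Succeeds when version $1 is greater than or equal to version $2 (GNU sort -V ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum driver for CUDA 10.0 on Linux; verify against NVIDIA's release notes.
min_driver="410.48"

if [ -r /proc/driver/nvidia/version ]; then
    current=$(driver_version < /proc/driver/nvidia/version)
    if version_ge "$current" "$min_driver"; then
        echo "driver $current should be able to run CUDA 10.0 images"
    else
        echo "driver '$current' is older than $min_driver; pick an older image or upgrade the driver"
    fi
else
    echo "no NVIDIA driver information found"
fi
```

With Docker the CUDA libraries live inside the image, so the host driver is the part that has to be new enough for the image you choose.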
Download and run a GPU-enabled TensorFlow image:
sudo docker run --gpus all -it --rm tensorflow/tensorflow:2.0.0-gpu-py3 python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
Start bash: sudo docker run --gpus all -it tensorflow/tensorflow:2.0.0-gpu-py3 bash
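However, a warning said that I had started the container as root and that it is better to start it with a user ID. So I first looked up my user ID:
(base) gpu604@gpu604:~$ id -u gpu604
1000
and then used the following command to enter bash: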
sudo docker run -u 1000:1000 --gpus all -it tensorflow/tensorflow:2.0.0-gpu-py3 bash
From here you can pip install new libraries, and leave bash with exit.
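Instead of hard-coding 1000:1000, the user and group IDs can be read from id at the moment the container is started. A minimal sketch, assuming the same image as above; current_user_flag and run_tf_bash are hypothetical helper names, and DOCKER is an override variable I introduce for illustration that defaults to "sudo docker".

```shell
# Print "UID:GID" for the current host user, in the form docker run -u expects.
current_user_flag() {
    printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

# Start bash in the TensorFlow image as the current host user.
run_tf_bash() {
    # DOCKER is a hypothetical override hook; unquoted on purpose so "sudo docker" splits into two words.
    docker_cmd="${DOCKER:-sudo docker}"
    # id runs in the calling shell before sudo, so it reports your own IDs, not root's.
    $docker_cmd run -u "$(current_user_flag)" --gpus all -it \
        tensorflow/tensorflow:2.0.0-gpu-py3 bash
}

# Example call (commented out so that pasting this block starts nothing):
# run_tf_bash
```

Each docker run creates a fresh container, so libraries installed with pip inside one session are not there the next time unless the container is kept and restarted or the image is rebuilt.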
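But then I found out that only the Professional edition of PyCharm can use a remote interpreter such as Docker.
So I had no choice but to go back to Anaconda. Using a newer TensorFlow inside Anaconda, however, means upgrading CUDA and cuDNN first, which is far too much trouble.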