DOCKER ZERO2HERO

  1. docker installation
    Reference link

  2. nvidia-docker installation
    Repository address
    Installation documentation

2.1 Setting up Docker
Docker-CE on Ubuntu can be set up using Docker’s official convenience script:

curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
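
To confirm the daemon is up after the script finishes, a quick sanity check can be run (a minimal sketch; the hello-world test assumes network access to Docker Hub):

# check the installed version and that the service is active
docker --version
sudo systemctl is-active docker

# end-to-end test with the tiny hello-world image
sudo docker run --rm hello-world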

2.2 Setting up NVIDIA Container Toolkit
Setup the stable repository and the GPG key:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
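
The $distribution variable above is just the OS ID concatenated with its version; on Ubuntu 20.04, for example, the following prints ubuntu20.04 (the exact value depends on your release):

# inspect the value used to pick the repository path
. /etc/os-release; echo $ID$VERSION_ID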

2.3 To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:

curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
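
To double-check which NVIDIA apt sources were written by the commands above, the list files can simply be printed (the second file exists only if the optional experimental branch from 2.3 was added):

# show the repository entries that apt will read on the next update
cat /etc/apt/sources.list.d/nvidia-docker.list
cat /etc/apt/sources.list.d/nvidia-container-runtime.list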

2.4 Install the nvidia-docker2 package (and dependencies) after updating the package listing:

sudo apt-get update
sudo apt-get install -y nvidia-docker2
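
To confirm the package and its runtime binary actually landed on the system (a quick, optional check; output format varies by distribution):

# the package should be listed and nvidia-container-runtime should be on PATH
dpkg -l | grep nvidia-docker2
which nvidia-container-runtime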

2.5 Restart the Docker daemon to complete the installation after setting the default runtime:

sudo systemctl restart docker

At this point, a working setup can be tested by running a base CUDA container:

sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

This should result in a console output shown below:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
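
Step 2.5 mentions "setting the default runtime"; one optional way to do that is to declare nvidia as the default in /etc/docker/daemon.json, so that docker run uses the nvidia runtime even without an explicit --runtime=nvidia flag. A sketch, assuming nvidia-container-runtime is on PATH (which installing nvidia-docker2 normally ensures):

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

Restart the Docker daemon afterwards for the change to take effect.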
  3. docker registry mirror (speed-up)
    Edit daemon.json and add a registry mirror, as follows:
sudo vim /etc/docker/daemon.json

{"registry-mirrors":["https://hub-mirror.c.163.com/"]}

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
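
After the restart, docker info should list the mirror; a quick check (grep is just one way to filter the output):

# the configured address should appear under "Registry Mirrors:"
sudo docker info | grep -A 1 "Registry Mirrors"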

Contents of the two daemon.json variants (nvidia-docker runtime vs. registry mirror):

nvidia runtime (daemon.json):
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

registry mirror (daemon.json.registry):
{"registry-mirrors":["https://hub-mirror.c.163.com/"]}

Reference link

  4. Downloading docker images

Go to the Docker Hub website.
Search for a keyword, e.g. pytorch, and open the Tags tab.
Find an image whose CUDA, PyTorch, and cuDNN versions match your needs.
Copy the pull command and run it on the command line, for example:

docker pull pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel
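
Once the pull completes, the image should appear in the local image list (a quick check):

# list local images and filter for the one just pulled
docker images | grep pytorch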
  5. Create a container from the image and continue configuring the environment (see the sketch below)
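
A minimal sketch of that step, assuming GPU access is wanted and that a host directory ~/workspace should be visible inside the container (the container name and mount path are illustrative choices, not fixed names):

# create and enter an interactive container from the pulled image
docker run --gpus all -it --name pytorch-dev \
    -v $HOME/workspace:/workspace \
    pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel /bin/bash

# inside the container, a quick check that CUDA is visible to PyTorch
python -c "import torch; print(torch.cuda.is_available())"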

  6. docker ERROR List

  • Error response from daemon: Unknown runtime specified nvidia

The error message shows that runtime=nvidia is not recognized, which means the daemon.json configuration file is wrong.
Edit /etc/docker/daemon.json (administrator privileges required) and add the following content:

{
    "registry-mirrors": ["your registry mirror URL"],
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

Then restart Docker:

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
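
After the restart, the nvidia runtime should show up in docker info; if it still does not, the daemon.json edit did not take effect (grep is just one way to filter the output):

# "Runtimes:" should now include nvidia alongside runc
sudo docker info | grep -i runtime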
