Docker installation
Reference link
nvidia-docker installation
Repository address
Installation docs
2.1 Setting up Docker
Docker-CE on Ubuntu can be set up using Docker's official convenience script:
curl https://get.docker.com | sh \
&& sudo systemctl --now enable docker
2.2 Setting up NVIDIA Container Toolkit
Setup the stable repository and the GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
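The $distribution variable used above is nothing more than the OS ID concatenated with its version, read from /etc/os-release; a quick way to see what it resolves to on the current machine:

```shell
# $distribution is ID + VERSION_ID from /etc/os-release,
# e.g. "ubuntu" + "20.04" -> "ubuntu20.04".
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
echo "nvidia-docker.list will be fetched for: $distribution"
```

If this prints an unexpected value (e.g. on a derivative distro), the repo list URL built from it may 404, which is a common failure mode of this step.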
2.3 To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
2.4 Install the nvidia-docker2 package (and dependencies) after updating the package listing:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
2.5 Restart the Docker daemon to complete the installation after setting the default runtime:
sudo systemctl restart docker
At this point, a working setup can be tested by running a base CUDA container:
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
This should result in a console output shown below:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
$ sudo vim /etc/docker/daemon.json
{
  "registry-mirrors": ["https://hub-mirror.c.163.com/"]
}
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
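An invalid daemon.json will keep the Docker daemon from starting, so it is worth syntax-checking the file before restarting. A minimal sketch, using a sample file in /tmp to stand in for /etc/docker/daemon.json:

```shell
# Sketch: validate daemon.json syntax before restarting Docker.
# /tmp/daemon.json.sample stands in for /etc/docker/daemon.json here.
cat > /tmp/daemon.json.sample <<'EOF'
{
  "registry-mirrors": ["https://hub-mirror.c.163.com/"]
}
EOF
# json.tool exits non-zero (and prints the parse error) on invalid JSON.
python3 -m json.tool /tmp/daemon.json.sample > /dev/null && echo "daemon.json: valid JSON"
```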
Contents of the two daemon.json fragments for docker / nvidia-docker (both belong in the single file /etc/docker/daemon.json):
daemon.json.runtimes
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
daemon.json.registry
{"registry-mirrors":["https://hub-mirror.c.163.com/"]}
Reference link
Go to the Docker Hub website
Search for a keyword, e.g. pytorch, and select the Tags tab
Find an image with the desired CUDA, PyTorch, and cuDNN versions
Copy the pull command and run it in a terminal, e.g.:
docker pull pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel
Create a container from the image and continue setting up the environment
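A typical "create container" command for the pulled image might look like the one composed below; the container name (pt17) and the /workspace bind mount are placeholder assumptions, not from the original notes. Since actually running it needs a configured GPU host, the sketch only echoes the command:

```shell
# Sketch: compose a typical docker run command for the pulled image.
# Container name (pt17) and the /workspace mount are placeholders.
IMAGE=pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel
RUN_CMD="docker run --gpus all -it --name pt17 -v \$PWD:/workspace $IMAGE bash"
# Only echoed here; run it on the GPU host once nvidia-docker is set up.
echo "$RUN_CMD"
```

--gpus all exposes the GPUs inside the container; -v mounts the current directory so work done in the container survives its removal.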
docker ERROR List
The error message shows that runtime=nvidia is not recognized, which indicates a mistake in the daemon.json configuration file.
Edit /etc/docker/daemon.json (requires root privileges) and add the following:
{
"registry-mirrors": ["your mirror registry URL"],
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
}
Then restart Docker:
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
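Another common cause of the same error is that the "path" entry in daemon.json points at a binary that is not actually installed. A small check, assuming the /usr/bin/nvidia-container-runtime path used above:

```shell
# Sketch: confirm the binary that daemon.json's "path" points at exists.
RUNTIME=/usr/bin/nvidia-container-runtime
if [ -x "$RUNTIME" ]; then
  STATUS=found
else
  STATUS=missing   # install nvidia-container-runtime, or fix the "path" entry
fi
echo "nvidia-container-runtime: $STATUS"
```

If it reports missing, reinstalling nvidia-docker2 (which pulls in nvidia-container-runtime as a dependency) is the usual fix.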