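I am fed up with how tedious it is to deploy a TensorRT + CUDA + OpenCV + FFmpeg + x264 runtime environment: every new server costs a lot of effort to set up. I heard that nvidia-docker removes most of that pain, and plenty of people recommend Docker for deployment, so I searched around, read up on it, and decided to install Docker and learn as I go. These are my notes. To use the GPU from containers, you apparently need Docker-CE plus the NVIDIA Container Toolkit. OK, let's start.
First, Docker has never been installed on my machine, so it has to go on first. Run the following script to install it: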
curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
The console output is as follows:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 19325 100 19325 0 0 2210 0 0:00:08 0:00:08 --:--:-- 4718
# Executing docker install script, commit: 3255aa3919e7281693f62855b9d543bb50f04957
+ sudo -E sh -c apt-get update -qq >/dev/null
[sudo] dingxin 的密码:
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
+ sudo -E sh -c mkdir -p /etc/apt/keyrings && chmod -R 0755 /etc/apt/keyrings
+ sudo -E sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | gpg --dearmor --yes -o /etc/apt/keyrings/docker.gpg
gpg: WARNING: unsafe ownership on homedir '/home/dingxin/.gnupg'
+ sudo -E sh -c chmod a+r /etc/apt/keyrings/docker.gpg
+ sudo -E sh -c echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu bionic stable" > /etc/apt/sources.list.d/docker.list
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq --no-install-recommends docker-ce docker-ce-cli containerd.io docker-compose-plugin docker-scan-plugin >/dev/null
+ version_gte 20.10
+ [ -z ]
+ return 0
+ sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq docker-ce-rootless-extras >/dev/null
+ sudo -E sh -c docker version
Client: Docker Engine - Community
Version: 20.10.16
API version: 1.41
Go version: go1.17.10
Git commit: aa7e414
Built: Thu May 12 09:17:28 2022
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.16
API version: 1.41 (minimum version 1.12)
Go version: go1.17.10
Git commit: f756502
Built: Thu May 12 09:15:33 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.4
GitCommit: 212e8b6fa2f44b9c21b2798135fc6fb7c53efc16
runc:
Version: 1.1.1
GitCommit: v1.1.1-0-g52de29d
docker-init:
Version: 0.19.0
GitCommit: de40ad0
================================================================================
To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:
dockerd-rootless-setuptool.sh install
Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.
To run the Docker daemon as a fully privileged service, but granting non-root
users access, refer to https://docs.docker.com/go/daemon-access/
WARNING: Access to the remote API on a privileged Docker daemon is equivalent
to root access on the host. Refer to the 'Docker daemon attack surface'
documentation for details: https://docs.docker.com/go/attack-surface/
================================================================================
Synchronizing state of docker.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable docker
Once the installation finishes, check the Docker version:
docker --version
The result:
Docker version 20.10.16, build aa7e414
This confirms that Docker was installed successfully.
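The version matters for the GPU step later on: as far as I know, the `--gpus` flag of `docker run` only exists from Docker 19.03 onward. Here is a minimal sketch of how that check could be scripted. The helper names `parse_docker_version` and `ver_at_least` are my own, not part of Docker, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Docker.

# parse_docker_version TEXT: extract "20.10.16" from
# "Docker version 20.10.16, build aa7e414".
parse_docker_version() {
    printf '%s\n' "$1" | sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

# ver_at_least A B: succeeds when version A >= version B.
# Uses version sort: if B sorts first (or ties), A is at least B.
ver_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample input copied from the output above; on a real machine
# you could pass "$(docker --version)" instead.
installed=$(parse_docker_version "Docker version 20.10.16, build aa7e414")
if ver_at_least "$installed" "19.03"; then
    echo "Docker $installed supports --gpus"
else
    echo "Docker $installed is too old for --gpus"
fi
```

With the version installed above (20.10.16), the check passes.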
Next, add the NVIDIA package repository and its GPG key by running the following script:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
The console output is as follows:
[sudo] dingxin 的密码:
OK
deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
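The `ubuntu18.04` in those URLs comes from the `distribution` variable, which glues together the `ID` and `VERSION_ID` fields of `/etc/os-release`. A small sketch of that step in isolation, run against a throwaway sample file rather than the real system file. The function name `distribution_id` is hypothetical.

```shell
#!/bin/sh
# distribution_id FILE: print $ID$VERSION_ID from an os-release style file.
# (Hypothetical helper; mirrors the one-liner used in the script above.)
distribution_id() {
    # Source the file in a subshell so ID/VERSION_ID do not leak out.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Sample data that mimics Ubuntu 18.04; not read from a real machine.
sample=$(mktemp)
printf 'ID=ubuntu\nVERSION_ID="18.04"\n' > "$sample"
distribution_id "$sample"
rm -f "$sample"
```

On a real machine, `distribution_id /etc/os-release` should give the same string that ends up in the repository URL.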
Now install the nvidia-docker2 package and its dependencies. First, refresh the package index:
sudo apt-get update
Console output:
命中:1 https://dl.google.com/linux/chrome/deb stable InRelease
命中:2 http://security.ubuntu.com/ubuntu bionic-security InRelease
命中:3 http://cn.archive.ubuntu.com/ubuntu bionic InRelease
命中:4 http://cn.archive.ubuntu.com/ubuntu bionic-updates InRelease
获取:5 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 InRelease [1,484 B]
命中:6 https://linux.teamviewer.com/deb stable InRelease
获取:7 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 InRelease [1,481 B]
命中:8 http://cn.archive.ubuntu.com/ubuntu bionic-backports InRelease
获取:9 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 InRelease [1,474 B]
获取:10 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 Packages [18.7 kB]
获取:11 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 Packages [7,416 B]
获取:12 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 Packages [4,488 B]
命中:13 https://download.docker.com/linux/ubuntu bionic InRelease
已下载 35.1 kB,耗时 3秒 (12.0 kB/s)
正在读取软件包列表... 完成
Next, install nvidia-docker2:
sudo apt-get install -y nvidia-docker2
The console output is as follows:
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
下列软件包是自动安装的并且现在不需要了:
linux-hwe-5.4-headers-5.4.0-100 linux-hwe-5.4-headers-5.4.0-104 linux-hwe-5.4-headers-5.4.0-105 linux-hwe-5.4-headers-5.4.0-107 linux-hwe-5.4-headers-5.4.0-109 linux-hwe-5.4-headers-5.4.0-42
linux-hwe-5.4-headers-5.4.0-89 linux-hwe-5.4-headers-5.4.0-90 linux-hwe-5.4-headers-5.4.0-91 linux-hwe-5.4-headers-5.4.0-92 linux-hwe-5.4-headers-5.4.0-94 linux-hwe-5.4-headers-5.4.0-96
linux-hwe-5.4-headers-5.4.0-99
使用'sudo apt autoremove'来卸载它(它们)。
将会同时安装下列软件:
libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit
下列【新】软件包将被安装:
libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit nvidia-docker2
升级了 0 个软件包,新安装了 4 个软件包,要卸载 0 个软件包,有 38 个软件包未被升级。
需要下载 1,934 kB 的归档。
解压缩后会消耗 7,730 kB 的额外空间。
获取:1 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 libnvidia-container1 1.9.0-1 [926 kB]
获取:2 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 libnvidia-container-tools 1.9.0-1 [23.9 kB]
获取:3 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 nvidia-container-toolkit 1.9.0-1 [978 kB]
获取:4 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 nvidia-docker2 2.10.0-1 [5,532 B]
已下载 1,934 kB,耗时 58秒 (33.3 kB/s)
正在选中未选择的软件包 libnvidia-container1:amd64。
(正在读取数据库 ... 系统当前共安装有 462588 个文件和目录。)
正准备解包 .../libnvidia-container1_1.9.0-1_amd64.deb ...
正在解包 libnvidia-container1:amd64 (1.9.0-1) ...
正在选中未选择的软件包 libnvidia-container-tools。
正准备解包 .../libnvidia-container-tools_1.9.0-1_amd64.deb ...
正在解包 libnvidia-container-tools (1.9.0-1) ...
正在选中未选择的软件包 nvidia-container-toolkit。
正准备解包 .../nvidia-container-toolkit_1.9.0-1_amd64.deb ...
正在解包 nvidia-container-toolkit (1.9.0-1) ...
正在选中未选择的软件包 nvidia-docker2。
正准备解包 .../nvidia-docker2_2.10.0-1_all.deb ...
正在解包 nvidia-docker2 (2.10.0-1) ...
正在设置 libnvidia-container1:amd64 (1.9.0-1) ...
正在设置 libnvidia-container-tools (1.9.0-1) ...
正在设置 nvidia-container-toolkit (1.9.0-1) ...
正在设置 nvidia-docker2 (2.10.0-1) ...
正在处理用于 libc-bin (2.27-3ubuntu1.5) 的触发器 ...
/sbin/ldconfig.real: /usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_ops_train.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_adv_train.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8 is not a symbolic link
The installation is complete.
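One step I did not record above: NVIDIA's install guide says to restart the Docker daemon after installing nvidia-docker2, so that the new runtime gets picked up. Below is a hedged sketch of the restart plus a quick check. It assumes that `docker info` prints a `Runtimes:` line listing the registered runtimes. The function `has_nvidia_runtime` and the `APPLY` switch are my own inventions for illustration.

```shell
#!/bin/sh
# has_nvidia_runtime: reads `docker info` text on stdin and succeeds
# if the "Runtimes:" line mentions nvidia. (Hypothetical helper; the
# exact layout of `docker info` output is an assumption.)
has_nvidia_runtime() {
    grep -i '^ *Runtimes:' | grep -qw 'nvidia'
}

# Nothing is restarted unless APPLY=1 is set (hypothetical safety switch).
if [ "${APPLY:-0}" = "1" ]; then
    sudo systemctl restart docker
    if sudo docker info 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime not found in docker info"
    fi
fi
```

Run it as `APPLY=1 sh check.sh` (the file name is just an example) to actually restart the daemon; without `APPLY=1` it only defines the helper.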
Next, pull a CUDA image and run nvidia-smi inside it as a GPU test:
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
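When this works, the console shows the nvidia-smi GPU table. Next, list the local images and the running containers: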
sudo docker images -a
sudo docker ps
Judging by the output, there are no running containers, which is no surprise since I only just installed Docker.
Let's pull an image, for example ubuntu:
sudo docker pull ubuntu