Configuring a TensorRT Environment with Nvidia-docker

Host-side configuration

Installing the NVIDIA driver

NOTE:
The NVIDIA kernel module version must match the version of the driver installed on the system.

Run the following command to see which kernel module version your graphics driver is using:

cat /proc/driver/nvidia/version
g@g-Inspiron-5675:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  430.26  Tue Jun  4 17:40:52 CDT 2019
GCC version:  gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1) 
g@g-Inspiron-5675:~$ 

My kernel module version is 430.
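The major version can also be pulled out of that file with a small script. This is a minimal sketch that assumes the NVRM line has the format shown in the output above; the function name nvidia_kmod_major is made up for illustration.

```shell
#!/bin/sh
# Print the major version of the NVIDIA kernel module, reading the
# contents of /proc/driver/nvidia/version from standard input.
# Assumption: the NVRM line looks like
#   NVRM version: NVIDIA UNIX x86_64 Kernel Module  430.26  Tue Jun ...
nvidia_kmod_major() {
  # Capture the digits before the first dot of the number that
  # follows "Kernel Module", and print only the first match.
  sed -n 's/^NVRM version:.*Kernel Module *\([0-9][0-9]*\)\.[0-9.]*.*/\1/p' | head -n 1
}

# Only query the real file when the driver is actually loaded.
if [ -r /proc/driver/nvidia/version ]; then
  nvidia_kmod_major < /proc/driver/nvidia/version
fi
```

The result can be compared by eye with the "Driver Version" field that nvidia-smi reports; the two should agree.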

Run the following command to uninstall the existing driver:

sudo apt-get purge 'nvidia*'

Run the following commands to add the graphics-drivers PPA (Personal Package Archive, Ubuntu only), which works a bit like an app store:

sudo add-apt-repository ppa:graphics-drivers
sudo apt-get update

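Run the following command to reinstall the driver, version 415 in this example (install whichever NVIDIA driver version suits your setup; just make sure the version numbers match):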

sudo apt-get install nvidia-415 nvidia-settings nvidia-prime

Now you can check your NVIDIA driver with the nvidia-smi command.

Optional: you can hold the current version so that apt does not upgrade it:

sudo apt-mark hold nvidia-430

Installing a specific version of docker-ce

Remove any Docker packages already installed on the Ubuntu system:

sudo apt-get remove docker docker-engine docker-ce docker.io

Update apt:

sudo apt-get update

Install the following packages so that apt can use repositories over HTTPS:

sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

Add Docker's official apt key and update apt:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-get update
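Adding the key alone does not tell apt where to find the packages: the Docker repository itself also has to be in the apt sources, if it is not configured already, before docker-ce versions from download.docker.com show up. A minimal sketch, assuming an amd64 machine and the stable channel; docker_repo_line is a hypothetical helper name.

```shell
#!/bin/sh
# Build the apt source line for Docker's Ubuntu repository.
# docker_repo_line is a hypothetical helper; amd64 and the "stable"
# channel are assumptions, adjust them for your machine.
docker_repo_line() {
  # $1: Ubuntu codename, e.g. xenial (as printed by `lsb_release -cs`)
  printf 'deb [arch=amd64] https://download.docker.com/linux/ubuntu %s stable\n' "$1"
}

# Usage on a real machine (left commented out because it changes apt sources):
#   sudo add-apt-repository "$(docker_repo_line "$(lsb_release -cs)")"
#   sudo apt-get update
```

The repository line is built in a function so that it can be inspected before anything on the system is modified.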

List all available docker-ce versions:

sudo apt-cache madison docker-ce

The output looks like this:

 docker-ce | 5:18.09.4~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 5:18.09.3~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 5:18.09.2~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 5:18.09.1~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 5:18.09.0~3-0~ubuntu-xenial | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 18.06.3~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 18.06.2~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 18.06.1~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 18.06.0~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 18.03.1~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 18.03.0~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 17.12.1~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages
 docker-ce | 17.12.0~ce-0~ubuntu | https://download.docker.com/linux/ubuntu xenial/stable amd64 Packages

Pick the version you need and install it:

sudo apt-get install docker-ce=18.06.3~ce~3-0~ubuntu

To check that the installation works, run the following commands:

sudo systemctl start docker
sudo docker info

Or check the Docker version:

docker -v
Docker version 18.06.3-ce, build d7080c1
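The numeric part of this version string is what the nvidia-docker2 package has to match, so it can be handy to extract it in a script. A small sketch, assuming the `docker -v` output format shown above; docker_version_number is a name invented for this example.

```shell
#!/bin/sh
# Extract the numeric Docker version from a `docker -v` line read on stdin.
# Assumption: the line looks like "Docker version 18.06.3-ce, build d7080c1".
docker_version_number() {
  # Keep the leading run of digits and dots after "Docker version ",
  # dropping suffixes such as "-ce" and the build hash.
  sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

# Usage on a machine with Docker installed (commented out here):
#   docker -v | docker_version_number
```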

Nvidia-Docker 2.0 Installation

Remove nvidia-docker 1.0 and the containers that use its volumes:

docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge nvidia-docker

Set up the repository and update:

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
# Detect the distribution and version, e.g. ubuntu16.04 in my case (you can ignore this line if it is unfamiliar)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
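The distribution variable in the commands above is the ID and VERSION_ID fields of /etc/os-release joined together. The sketch below wraps the same idea in a function (distribution_id, a hypothetical name) that takes the file path as an argument, so the value can be checked before it is used in the repository URL.

```shell
#!/bin/sh
# Print ID followed by VERSION_ID from an os-release style file.
# distribution_id is a hypothetical helper name.
distribution_id() {
  # Source the file inside a subshell so its variables do not leak
  # into the calling shell, then print the two fields back to back.
  ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# On the author's machine this would print ubuntu16.04.
if [ -r /etc/os-release ]; then
  distribution_id /etc/os-release
fi
```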

Install nvidia-docker 2.0:

sudo apt-get install nvidia-docker2
sudo pkill -SIGHUP dockerd

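At this point it comes down to luck: if your Docker version happens to be compatible with the nvidia-docker version that apt picks by default, the installation is done. Most of the time, though, you will hit an error like the following: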
[Figure 1: screenshot of the error]
Abort
Test nvidia-docker:

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

It fails!

g@g-Inspiron-5675:~$ sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.
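The "Unknown runtime specified nvidia" message means the Docker daemon has no runtime named nvidia registered. One way to confirm this is to look at the "Runtimes:" line of `docker info`. The sketch below assumes that line lists the runtime names separated by spaces; has_nvidia_runtime is a name made up for this example.

```shell
#!/bin/sh
# Succeed (exit status 0) if `docker info` output read on stdin lists a
# runtime called "nvidia" on its "Runtimes:" line.
# Assumption: the line looks like "Runtimes: nvidia runc".
has_nvidia_runtime() {
  grep -Eq '^ *Runtimes:.*(^|[[:space:]])nvidia([[:space:]]|$)'
}

# Usage on a machine with Docker installed (commented out here):
#   if sudo docker info 2>/dev/null | has_nvidia_runtime; then
#     echo "nvidia runtime is registered"
#   else
#     echo "nvidia runtime is missing"
#   fi
```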

This is expected: we need to install the nvidia-docker version that matches our own environment.
First, check the repository (hosted on GitHub) to see all available versions:

sudo apt-cache madison nvidia-docker2 nvidia-container-runtime
nvidia-docker2 | 2.0.3+docker18.06.3-3 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker18.06.2-2 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker18.06.2-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker18.06.1-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker18.06.0-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker18.03.1-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker18.03.0-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker17.12.1-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker17.12.0-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages
nvidia-docker2 | 2.0.3+docker17.09.1-1 | https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64  Packages

Since the docker-ce version I installed is 18.06.3, I chose:

sudo apt-get install nvidia-docker2=2.0.3+docker18.06.3-3
sudo pkill -SIGHUP dockerd
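The manual choice above can also be scripted by filtering the `apt-cache madison` listing for the entry built against a given Docker version. A minimal sketch, assuming the listing format shown earlier (columns separated by "|" and versions of the form 2.0.3+docker18.06.3-3); match_nvidia_docker2 is a hypothetical helper name.

```shell
#!/bin/sh
# Print the first nvidia-docker2 version in an `apt-cache madison` listing
# (read on stdin) that was built for the Docker version given as $1.
match_nvidia_docker2() {
  awk -F'|' -v want="+docker$1-" '
    {
      gsub(/ /, "", $2)                      # strip padding around the version column
      if (index($2, want) > 0) { print $2; exit }
    }'
}

# Usage on a real machine (commented out because it installs a package):
#   ver=$(apt-cache madison nvidia-docker2 | match_nvidia_docker2 18.06.3)
#   sudo apt-get install "nvidia-docker2=$ver"
```

When several builds exist for the same Docker version, this picks whichever one the listing shows first.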

Finally, run the nvidia-docker 2.0 test again:

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

It works!

g@g-Inspiron-5675:~$ sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
Tue Nov 19 06:29:24 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:09:00.0  On |                  N/A |
| 41%   35C    P8     5W / 120W |    547MiB /  6069MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Installing TensorRT

I won't write this part out; follow the official documentation to install the Docker version of TensorRT:

https://github.com/NVIDIA/TensorRT#prerequisites
