TensorRT Installation and Deployment

1. Install the NVIDIA Driver and CUDA Toolkit

Reference: https://blog.csdn.net/xueshengke/article/details/78134991
NVIDIA driver download: https://www.nvidia.cn/Download/index.aspx?lang=cn
CUDA Toolkit download: https://developer.nvidia.com/cuda-toolkit-archive
Installing dkms: https://blog.csdn.net/qq_41613251/article/details/108488681

1.1 Install the NVIDIA Driver

1) Check the kernel version

# uname -r
# 3.10.0-1062.el7.x86_64 ; kernel versions differ between systems, so make a note of yours
# df -h ; confirm that the boot partition has at least 300 MB of free space

2) Disable the nouveau driver
nouveau is the display driver that ships with the system. It must be disabled before the next steps; otherwise the NVIDIA installer aborts with a message such as "You appear to be running an X server …". Open the two files below (create them if they do not exist), add the following two lines to each, and save.

# vim /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
# vim /lib/modprobe.d/nvidia-installer-disable-nouveau.conf
...
blacklist nouveau
options nouveau modeset=0

3) Rebuild the initramfs image
The nouveau blacklist only takes effect after the initramfs image is rebuilt and the system is rebooted. Back up or remove the existing image before rebuilding (the commands below back it up and pass --force to dracut), otherwise dracut complains that it cannot overwrite the existing file.
Make sure the boot partition has enough free space (more than 400 MB is recommended), otherwise this step will fail.

# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
# dracut /boot/initramfs-$(uname -r).img $(uname -r) --force
# rm /boot/initramfs-$(uname -r).img.bak ; optional, removes the backup

4) Reboot

# systemctl set-default multi-user.target
# init 3
# reboot

5) Install prerequisite packages
Install the required build packages (network access is needed):

# yum install gcc kernel-devel kernel-headers

Run the installer as follows. The kernel source path must be specified, otherwise the installer errors out; the kernel version depends on your system and may differ from the one shown here.

# cd /to/your/directory/ ; change to the directory containing the driver .run file
# ./NVIDIA-Linux-x86_64-510.47.03.run --kernel-source-path=/usr/src/kernels/3.10.0-1062.el7.x86_64  -k $(uname -r)

After running the command, the driver package is unpacked and the interactive installation starts. Some warnings may appear along the way; they do not affect the result.

Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 ******.......................................
..................................................................
..................................................................

License agreement: select Accept.

Install the 32-bit compatibility libraries: select Yes.

The installation then completes successfully.

6) Verify the driver installation
Run the following two commands. If the GPU model information is displayed, the driver was installed successfully.

# lspci |grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GP104GL [Tesla P4] (rev a1)
# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   20C    P0    84W / 350W |      0MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:25:00.0 Off |                  N/A |
| 31%   22C    P0    91W / 350W |      0MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:81:00.0 Off |                  N/A |
| 32%   24C    P0    88W / 350W |      0MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:C1:00.0 Off |                  N/A |
| 33%   22C    P0    85W / 350W |      0MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
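
Optionally, the same check can be done from Python through NVML; a minimal sketch, assuming the pynvml package (nvidia-ml-py3) has been installed with pip, which is not part of the steps above:

# Minimal sketch: enumerate GPUs and driver version via NVML (assumes pynvml is installed).
import pynvml

pynvml.nvmlInit()
print("Driver version:", pynvml.nvmlSystemGetDriverVersion())
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(i, pynvml.nvmlDeviceGetName(handle), mem.total // (1024 ** 2), "MiB")
pynvml.nvmlShutdown()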

1.2 Install the CUDA Toolkit


Installation Instructions:

wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-rhel7-11-4-local-11.4.0_470.42.01-1.x86_64.rpm
sudo rpm -i cuda-repo-rhel7-11-4-local-11.4.0_470.42.01-1.x86_64.rpm
sudo yum clean all
# sudo yum -y install nvidia-driver-latest-dkms cuda
# sudo yum -y install cuda-drivers
sudo yum -y install cuda   # only this command needs to be run (the driver was already installed in 1.1)

There is no need to hard-code the CUDA version in the paths below; the installer creates a symlink cuda -> cuda-11. If it is missing, create it manually:

ln -s /usr/local/cuda-11 /usr/local/cuda   # create the symlink

Configure the environment variables:

# vim /etc/profile
...
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# source /etc/profile ; apply the changes to the current shell

Test CUDA
First, check that the cuda* and nvcc commands are available:

# cuda ; press Tab twice to list the available cuda* commands
cuda                          cuda-gdb                      cuda-install-samples-11.4.sh  
cudafe++                      cuda-gdbserver                cuda-memcheck
# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jun__2_19:15:15_PDT_2021
Cuda compilation tools, release 11.4, V11.4.48
Build cuda_11.4.r11.4/compiler.30033411_0
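
As an optional extra check that the toolkit is usable from Python code, a short PyCUDA sketch (PyCUDA is an assumption here and would be installed separately with pip install pycuda):

# Minimal sketch: list CUDA devices through PyCUDA (assumes pycuda is installed).
import pycuda.driver as cuda

cuda.init()
print("CUDA version PyCUDA was built against:", cuda.get_version())
for i in range(cuda.Device.count()):
    dev = cuda.Device(i)
    major, minor = dev.compute_capability()
    print(i, dev.name(), "compute capability %d.%d" % (major, minor))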

1.3 Install the NVIDIA Container Toolkit

Reference: https://github.com/triton-inference-server/server/blob/main/docs/quickstart.md
Reference: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum clean expire-cache
sudo yum install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

Configure a registry mirror for nvidia-docker
After installing nvidia-docker2, a registry mirror can be added to speed up image pulls; the example below adds an Aliyun mirror, which you can replace with one of your own choosing.

sudo vim /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
     "registry-mirrors": ["https://cbrok4rc.mirror.aliyuncs.com"]
}
sudo systemctl daemon-reload
sudo systemctl restart docker
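
If Docker fails to come back up after the restart, the usual cause is a JSON syntax error in daemon.json; a quick way to validate the file from Python:

# Quick sanity check that /etc/docker/daemon.json is valid JSON.
import json

with open("/etc/docker/daemon.json") as f:
    print(json.load(f))   # raises a ValueError pointing at the offending position if the file is malformed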

2. Generate the model.plan (engine) file with the TensorRT container and tensorrtx

Reference: https://github.com/wang-xinyu/tensorrtx/tree/master/yolov5

2.1 Generate the .wts file from the .pt checkpoint

The generic steps are:

# cp {tensorrtx}/yolov5/gen_wts.py {ultralytics}/yolov5
# cd {ultralytics}/yolov5
# python gen_wts.py -w yolov5s.pt -o yolov5s.wts
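
For reference, gen_wts.py essentially dumps every tensor of the checkpoint's state_dict as hex-encoded floats. The sketch below only illustrates that .wts text format as I understand the tensorrtx convention; use the shipped gen_wts.py for the actual conversion:

# Illustrative sketch of the .pt -> .wts conversion; gen_wts.py is the authoritative version.
import struct
import torch

# Run from inside the yolov5 project so the pickled model classes can be imported.
model = torch.load("yolov5s.pt", map_location="cpu")["model"].float()   # yolov5 checkpoints store the model under "model"
state = model.state_dict()

with open("yolov5s.wts", "w") as f:
    f.write("{}\n".format(len(state)))                 # header: number of tensors
    for name, tensor in state.items():
        flat = tensor.reshape(-1).cpu().numpy()
        f.write("{} {}".format(name, len(flat)))       # tensor name and element count
        for value in flat:
            f.write(" ")
            f.write(struct.pack(">f", float(value)).hex())   # each value as a big-endian float in hex
        f.write("\n")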

Activate the conda environment: conda activate python38

1) helmet

cp ~/tensorrtx/yolov5/gen_wts.py ~/yolov5_5.0_helmet/   # copy gen_wts.py into the yolov5 project
cd ~/yolov5_5.0_helmet/
python gen_wts.py -w yolov5s.pt -o yolov5s.wts

2) mask — same steps as helmet, using the mask project directory and weights

3) fire — same steps as helmet, using the fire project directory and weights

2.2 Deploy the TensorRT container

Reference: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt
Reference: https://docs.nvidia.com/deeplearning/tensorrt/container-release-notes/index.html
Key reference: https://medium.com/@penolove15/yolov4-with-triton-inference-server-and-client-6b02f085c622


sudo docker pull nvcr.io/nvidia/tensorrt:21.09-py3

Start the TensorRT container:

sudo docker run --gpus all -it \
--name trt_transfer \
-v /home/huaxi:/huaxi nvcr.io/nvidia/tensorrt:21.09-py3 \
/bin/bash

To get back into a container that already exists:

sudo docker ps -a                             # find the ID of the container
sudo docker start -ia 02e73b37db71            # start the stopped container (by ID) and attach
sudo docker exec -it 02e73b37db71 /bin/bash   # open another shell in the running container

Inside the container, switch to a faster apt mirror:

sed -i s:/archive.ubuntu.com:/mirrors.tuna.tsinghua.edu.cn/ubuntu:g /etc/apt/sources.list
cat /etc/apt/sources.list
apt-get clean
apt-get -y update --fix-missing  # the cleaned cache means libopencv-dev and its dependencies will be downloaded again

Install OpenCV:

cmake --version  # if cmake is missing, run: apt install cmake
apt-get install libopencv-dev  # installs OpenCV; if it fails with "connection failed", rerun the command until it succeeds

2.3 Generate the model.plan (engine) file inside the TensorRT container

The following is the reference workflow from the tensorrtx README (shown for reference only; do not run it as-is):

cd {tensorrtx}/yolov5/
// update CLASS_NUM in yololayer.h if your model is trained on custom dataset
mkdir build
cd build
cp {ultralytics}/yolov5/yolov5s.wts {tensorrtx}/yolov5/build
cmake ..
make
sudo ./yolov5 -s [.wts] [.engine] [n/s/m/l/x/n6/s6/m6/l6/x6 or c/c6 gd gw]  // serialize model to plan file
sudo ./yolov5 -d [.engine] [image folder]  // deserialize and run inference, the images in [image folder] will be processed.
// For example yolov5s
sudo ./yolov5 -s yolov5s.wts yolov5s.engine s
sudo ./yolov5 -d yolov5s.engine ../samples
// For example Custom model with depth_multiple=0.17, width_multiple=0.25 in yolov5.yaml
sudo ./yolov5 -s yolov5_custom.wts yolov5.engine c 0.17 0.25
sudo ./yolov5 -d yolov5.engine ../samples

The commands actually run for the three models (helmet, mask, fire) are listed below.
1) Generate the helmet engine

cd /huaxi/tensorrtx/yolov5_helmet/
vim yololayer.h    # already edited per the tensorrtx GitHub instructions (e.g. CLASS_NUM)
vim yolov5.cpp     # already edited per the tensorrtx GitHub instructions
mkdir build
cd build
cp /huaxi/yolov5_5.0_helmet/yolov5s.wts /huaxi/tensorrtx/yolov5_helmet/build
cmake ..
make
./yolov5 -s yolov5s.wts yolov5s.engine s   # serialize the model into the .engine file
./yolov5 -d yolov5s.engine ../images/      # deserialize the .engine file and test it on the sample images
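
Optionally, the serialized engine can be double-checked from Python inside the same container with the TensorRT Python API; a minimal sketch, assuming the build paths used above (adjust them if yours differ):

# Minimal sketch: deserialize the generated engine with the TensorRT Python API.
import ctypes
import tensorrt as trt

# The yolov5 engine uses custom layers, so the plugin library must be loaded first.
ctypes.CDLL("/huaxi/tensorrtx/yolov5_helmet/build/libmyplugins.so")

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")    # register the plugin creators

with open("/huaxi/tensorrtx/yolov5_helmet/build/yolov5s.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

assert engine is not None, "engine failed to deserialize"
for i in range(engine.num_bindings):
    print(engine.get_binding_name(i), engine.get_binding_shape(i))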

2) Generate the mask engine — same commands as helmet, using the mask project directory and weights

3) Generate the fire engine — same commands as helmet, using the fire project directory and weights

3. Deploy the engine file with NVIDIA Triton Inference Server

Key references:
https://medium.com/@penolove15/yolov4-with-triton-inference-server-and-client-6b02f085c622
https://blog.csdn.net/JulyLi2019/article/details/119875633

3.1 Install the Triton Server Docker image

Note: the 21.09 containers ship TensorRT 8.0.3, so the Triton image version must match the TensorRT container used to build the engine.


# docker pull nvcr.io/nvidia/tritonserver:<xx.yy>-py3
sudo docker pull nvcr.io/nvidia/tritonserver:21.09-py3  # must match the version of the TensorRT container used above

The model repository is the directory where you place the models that you want Triton to serve.
Reference: https://blog.csdn.net/JulyLi2019/article/details/119875633

mkdir -p ~/Triton/model_repository/helmet_detection/1/
mkdir -p ~/Triton/plugins/helmet_detection/
cp ~/tensorrtx/yolov5_helmet/build/yolov5s.engine ~/Triton/model_repository/helmet_detection/1/model.plan
cp ~/tensorrtx/yolov5_helmet/build/libmyplugins.so ~/Triton/plugins/helmet_detection/

Create start_server.sh (the paths must match the volume mapping used when the container is created):

LD_PRELOAD=/Triton/plugins/helmet_detection/libmyplugins.so tritonserver --model-repository=/Triton/model_repository/

Initial start of the Triton Server container:

# docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/full/path/to/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:<xx.yy>-py3 tritonserver --model-repository=/models

sudo docker run \
   --gpus all \
   --shm-size=1g \
   --ulimit memlock=-1 \
   --ulimit stack=67108864 \
   -p 8000:8000 -p 8001:8001 -p 8002:8002 \
   --name trt_serving \
   -v /home/huaxi/Triton:/Triton \
   -itd \
   nvcr.io/nvidia/tritonserver:21.09-py3 \
   /bin/bash /Triton/start_server.sh 

Stopping, restarting and entering the running container:

sudo docker stop 515f33be25b6                 # stop the container (by ID)
sudo docker ps -a                             # find the ID of the container
sudo docker start -ia 515f33be25b6            # start the stopped container and attach to it
sudo docker exec -it 515f33be25b6 /bin/bash   # open a shell in the running container
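
Before moving on to the client, it is worth confirming from the host that the server and the model are ready; a minimal sketch using the tritonclient HTTP API (the package installed in 3.2 below), with the model name matching the helmet_detection repository directory created above:

# Minimal sketch: confirm the Triton server is live and the model is loaded.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
print("server live :", client.is_server_live())
print("server ready:", client.is_server_ready())
print("model ready :", client.is_model_ready("helmet_detection"))
print(client.get_model_metadata("helmet_detection"))   # shows the input/output tensor names, dtypes and shapes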

3.2 Deploy the Triton client and test the deployed model

Reference: https://blog.csdn.net/JulyLi2019/article/details/119875633
Reference: https://github.com/JulyLi2019/tensorrt-yolov5 (download the client source code from here)

Activate the conda environment (conda activate python38) and install the client dependencies:

conda install -c conda-forge python-rapidjson
pip install tritonclient==2.18.0  # pip uninstall tritonclient

cd ~/Triton_client/triton_client_yolov5
python client_image.py
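
client_image.py takes care of preprocessing and NMS; the core Triton request it wraps looks roughly like the sketch below. The tensor names data/prob, the 640x640 input size and the test.jpg file are assumptions based on the tensorrtx yolov5 defaults, so adjust them to match your yololayer.h and model configuration:

# Minimal sketch of one inference call against the deployed engine.
import cv2
import numpy as np
import tritonclient.http as httpclient

INPUT_W, INPUT_H = 640, 640              # assumed tensorrtx defaults; check yololayer.h

# naive resize; client_image.py does the proper letterbox preprocessing
img = cv2.imread("test.jpg")
img = cv2.resize(img, (INPUT_W, INPUT_H))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
batch = np.transpose(img, (2, 0, 1))[np.newaxis, ...]   # NCHW, batch size 1

client = httpclient.InferenceServerClient(url="localhost:8000")
inputs = [httpclient.InferInput("data", list(batch.shape), "FP32")]   # drop the batch dim if your config.pbtxt sets max_batch_size: 0
inputs[0].set_data_from_numpy(batch)
outputs = [httpclient.InferRequestedOutput("prob")]

result = client.infer(model_name="helmet_detection", inputs=inputs, outputs=outputs)
print(result.as_numpy("prob").shape)     # raw detections; client_image.py decodes them and applies NMS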
