1. Install the Nvidia Driver and CUDA Toolkit
Reference: https://blog.csdn.net/xueshengke/article/details/78134991
Nvidia Driver download: https://www.nvidia.cn/Download/index.aspx?lang=cn
CUDA Toolkit download: https://developer.nvidia.com/cuda-toolkit-archive
Installing dkms: https://blog.csdn.net/qq_41613251/article/details/108488681
1.1 Install the Nvidia Driver
1) Check the system kernel version
# uname -r
# 3.10.0-1062.el7.x86_64 ; the kernel version differs between systems, note it down
# df -h ; confirm that /boot has at least 300 MB of free space
2) Disable the nouveau driver
nouveau is the display driver that ships with the system. It must be disabled before continuing, otherwise the Nvidia driver installation fails with a message like "You appear to be running an X server …". Open the two files below (create them if they do not exist), add the following two lines to each, and save.
# vim /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
# vim /lib/modprobe.d/nvidia-installer-disable-nouveau.conf
...
blacklist nouveau
options nouveau modeset=0
3) Rebuild the initramfs image
The blacklist only takes effect after the image is rebuilt and the system is rebooted. Back up the existing image and rebuild it with --force (or remove it first), otherwise dracut reports that it cannot overwrite the existing image.
Make sure /boot has enough free space before this step, otherwise it fails; more than 400 MB is recommended.
# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
# dracut /boot/initramfs-$(uname -r).img $(uname -r) --force
# rm /boot/initramfs-$(uname -r).img.bak ; optional, removes the backup
4) Reboot
# systemctl set-default multi-user.target
# init 3
# reboot
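After the reboot, you can confirm that nouveau is no longer loaded. This check is not part of the original procedure, just a quick sanity test: no output means the blacklist took effect.
# lsmod | grep nouveau ; should print nothing if nouveau has been disabled successfully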
5) Install prerequisites
Install the required build components (network access needed)
# yum install gcc kernel-devel kernel-headers
Run the installer as follows. The kernel source path must be specified, otherwise the installer fails; the kernel version depends on the system kernel and may differ.
# cd /to/your/directory/ ; change to the directory containing the driver package
# ./NVIDIA-Linux-x86_64-510.47.03.run --kernel-source-path=/usr/src/kernels/3.10.0-1062.el7.x86_64 -k $(uname -r)
The installer unpacks the driver package and starts the installation; some warnings may appear along the way, but they do not affect the result.
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 ******.......................................
..................................................................
..................................................................
License -> accept
Install 32-bit compatibility libraries -> yes
Installation completes successfully
Verify the driver installation
Run the following two commands; if the GPU model information is shown, the driver has been installed successfully.
# lspci |grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GP104GL [Tesla P4] (rev a1)
# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| 30% 20C P0 84W / 350W | 0MiB / 12288MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:25:00.0 Off | N/A |
| 31% 22C P0 91W / 350W | 0MiB / 12288MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:81:00.0 Off | N/A |
| 32% 24C P0 88W / 350W | 0MiB / 12288MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA GeForce ... Off | 00000000:C1:00.0 Off | N/A |
| 33% 22C P0 85W / 350W | 0MiB / 12288MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
1.2 Install the CUDA Toolkit
Installation Instructions:
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-rhel7-11-4-local-11.4.0_470.42.01-1.x86_64.rpm
sudo rpm -i cuda-repo-rhel7-11-4-local-11.4.0_470.42.01-1.x86_64.rpm
sudo yum clean all
# sudo yum -y install nvidia-driver-latest-dkms cuda
# sudo yum -y install cuda-drivers
sudo yum -y install cuda # only this command needs to be run
There is no need to pin a specific CUDA version; the installer already creates the symlink cuda -> cuda-11. If it is missing, run:
ln -s /usr/local/cuda-11 /usr/local/cuda # create the symlink
Configure the environment variables
# vim /etc/profile
...
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# source /etc/profile ; apply the changes immediately
Test CUDA
First, check whether the cuda and nvcc commands are available
# cuda ; press Tab twice
cuda cuda-gdb cuda-install-samples-11.4.sh
cudafe++ cuda-gdbserver cuda-memcheck
# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jun__2_19:15:15_PDT_2021
Cuda compilation tools, release 11.4, V11.4.48
Build cuda_11.4.r11.4/compiler.30033411_0
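Optionally, the bundled CUDA samples give a quick end-to-end test. A minimal sketch, assuming the cuda-install-samples-11.4.sh helper shown above and its default output layout (the exact sample path may differ on your system):
# cuda-install-samples-11.4.sh ~/cuda-samples ; copy the samples into ~/cuda-samples
# cd ~/cuda-samples/NVIDIA_CUDA-11.4_Samples/1_Utilities/deviceQuery
# make
# ./deviceQuery ; the run should end with "Result = PASS"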
1.3 Install the NVIDIA Container Toolkit
Reference: https://github.com/triton-inference-server/server/blob/main/docs/quickstart.md
Reference: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum clean expire-cache
sudo yum install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
nvidia-docker registry mirror
After installing nvidia-docker2, add a registry mirror if you need one; the Aliyun mirror below is just an example, adjust it to your own needs.
sudo vim /etc/docker/daemon.json
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
},
"registry-mirrors": ["https://cbrok4rc.mirror.aliyuncs.com"]
}
sudo systemctl daemon-reload
sudo systemctl restart docker
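To confirm that the configuration above was picked up, inspect the Docker daemon info (the exact layout of docker info varies slightly between Docker versions):
sudo docker info | grep -i -A 3 runtimes # the "nvidia" runtime should be listed
sudo docker info | grep -i -A 2 "registry mirrors" # the Aliyun mirror should be listed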
2. Generate the model.plan (engine) file with the TensorRT Container and tensorrtx
Reference: https://github.com/wang-xinyu/tensorrtx/tree/master/yolov5
2.1 Generate the .wts file from the .pt file
The following is an example:
# cp {tensorrtx}/yolov5/gen_wts.py {ultralytics}/yolov5
# cd {ultralytics}/yolov5
# python gen_wts.py -w yolov5s.pt -o yolov5s.wts
Enter the conda environment: conda activate python38
1) helmet
cp ~/tensorrtx/yolov5/gen_wts.py ~/yolov5_5.0_helmet/ # copy gen_wts.py into the yolov5 repo
cd ~/yolov5_5.0_helmet/
python gen_wts.py -w yolov5s.pt -o yolov5s.wts
2) mask (same three steps as helmet; see the sketch after this list)
3) fire (same three steps as helmet; see the sketch after this list)
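A minimal sketch for the mask and fire models, assuming they live in repositories laid out like the helmet one; the directory names ~/yolov5_5.0_mask and ~/yolov5_5.0_fire and the weight file names are assumptions, so substitute your actual paths:
cp ~/tensorrtx/yolov5/gen_wts.py ~/yolov5_5.0_mask/ && cd ~/yolov5_5.0_mask/ # assumed path
python gen_wts.py -w yolov5s.pt -o yolov5s.wts
cp ~/tensorrtx/yolov5/gen_wts.py ~/yolov5_5.0_fire/ && cd ~/yolov5_5.0_fire/ # assumed path
python gen_wts.py -w yolov5s.pt -o yolov5s.wts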
2.2 Deploy the TensorRT Container
Reference: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt
Reference: https://docs.nvidia.com/deeplearning/tensorrt/container-release-notes/index.html
Key reference: https://medium.com/@penolove15/yolov4-with-triton-inference-server-and-client-6b02f085c622
sudo docker pull nvcr.io/nvidia/tensorrt:21.09-py3
Start the TensorRT container:
sudo docker run --gpus all -it \
--name trt_transfer \
-v /home/huaxi:/huaxi nvcr.io/nvidia/tensorrt:21.09-py3 \
/bin/bash
Enter a container that is already running
sudo docker ps -a # find the ID of the container to enter
sudo docker start -ia 02e73b37db71 # start the stopped container by its ID and attach
sudo docker exec -it 02e73b37db71 /bin/bash
After entering the container, update the apt sources:
sed -i s:/archive.ubuntu.com:/mirrors.tuna.tsinghua.edu.cn/ubuntu:g /etc/apt/sources.list
cat /etc/apt/sources.list
apt-get clean
apt-get -y update --fix-missing # the cache was cleared above, so libopencv-dev and its dependencies will be downloaded again
Install OpenCV:
cmake --version # if cmake is missing, run: apt install cmake
apt-get install libopencv-dev # install OpenCV; if this fails with "connection failed", rerun the command until it succeeds
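To confirm the package really installed (useful after retrying the command):
dpkg -s libopencv-dev | grep Status # should report "install ok installed"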
2.3 Generate the model.plan (engine) file with the TensorRT container
The following is the reference example and does not need to be executed:
cd {tensorrtx}/yolov5/
// update CLASS_NUM in yololayer.h if your model is trained on custom dataset
mkdir build
cd build
cp {ultralytics}/yolov5/yolov5s.wts {tensorrtx}/yolov5/build
cmake ..
make
sudo ./yolov5 -s [.wts] [.engine] [n/s/m/l/x/n6/s6/m6/l6/x6 or c/c6 gd gw] // serialize model to plan file
sudo ./yolov5 -d [.engine] [image folder] // deserialize and run inference, the images in [image folder] will be processed.
// For example yolov5s
sudo ./yolov5 -s yolov5s.wts yolov5s.engine s
sudo ./yolov5 -d yolov5s.engine ../samples
// For example Custom model with depth_multiple=0.17, width_multiple=0.25 in yolov5.yaml
sudo ./yolov5 -s yolov5_custom.wts yolov5.engine c 0.17 0.25
sudo ./yolov5 -d yolov5.engine ../samples
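Before running cmake, CLASS_NUM in yololayer.h must match the number of classes the custom model was trained on. A minimal sketch of the edit, assuming the constant is defined as CLASS_NUM = 80 in the stock file and using 2 classes purely as an illustrative value; take the real class counts for the helmet, mask, and fire models from their training configs:
cd {tensorrtx}/yolov5/
sed -i 's/CLASS_NUM = 80/CLASS_NUM = 2/' yololayer.h # 2 is an example value, not the real class count
grep CLASS_NUM yololayer.h # verify the change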
The scripts actually executed are below (covering the helmet, mask, and fire models)
1) Generate the helmet engine
cd /huaxi/tensorrtx/yolov5_helmet/
vim yololayer.h # already modified following the tensorrtx GitHub instructions
vim yolov5.cpp # already modified following the tensorrtx GitHub instructions
mkdir build
cd build
cp /huaxi/yolov5_5.0_helmet/yolov5s.wts /huaxi/tensorrtx/yolov5_helmet/build
cmake ..
make
./yolov5 -s yolov5s.wts yolov5s.engine s # serialize the corresponding .engine file
./yolov5 -d yolov5s.engine ../images/ # test the .engine file
2) Generate the mask engine (same steps as helmet; see the sketch after this list)
3) Generate the fire engine (same steps as helmet; see the sketch after this list)
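A minimal sketch for the mask and fire engines, assuming per-model tensorrtx copies analogous to the helmet one; the paths /huaxi/tensorrtx/yolov5_mask, /huaxi/yolov5_5.0_mask and their fire counterparts are assumptions:
cd /huaxi/tensorrtx/yolov5_mask/ && mkdir build && cd build # assumed path; yololayer.h and yolov5.cpp already adjusted
cp /huaxi/yolov5_5.0_mask/yolov5s.wts . # assumed path
cmake .. && make
./yolov5 -s yolov5s.wts yolov5s.engine s
./yolov5 -d yolov5s.engine ../images/
Repeat the same steps with the fire paths (/huaxi/tensorrtx/yolov5_fire/, /huaxi/yolov5_5.0_fire/).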
3. Deploy the engine file with NVIDIA TRITON INFERENCE SERVER
Key references:
https://medium.com/@penolove15/yolov4-with-triton-inference-server-and-client-6b02f085c622
https://blog.csdn.net/JulyLi2019/article/details/119875633
3.1 Install the Triton Server Docker Image
Note: TensorRT 8.0.3
# docker pull nvcr.io/nvidia/tritonserver:<xx.yy>-py3
sudo docker pull nvcr.io/nvidia/tritonserver:21.09-py3 # same release as the tensorrt container above, so the versions match
The model repository is the directory where you place the models that you want Triton to serve.
Reference: https://blog.csdn.net/JulyLi2019/article/details/119875633
mkdir -p ~/Triton/model_repository/helmet_detection/1/
mkdir -p ~/Triton/plugins/helmet_detection/
cp -R ~/tensorrtx/yolov5_helmet/build/yolov5s.engine ~/Triton/model_repository/helmet_detection/1/model.plan
cp -R ~/tensorrtx/yolov5_helmet/build/libmyplugins.so ~/Triton/plugins/helmet_detection/
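With the default strict model configuration, Triton also expects a config.pbtxt next to the version directory. A hypothetical sketch for the helmet model, assuming the stock tensorrtx yolov5 blob names "data" and "prob", a 3x640x640 input, and an output of 6001 floats; verify every name and dimension against yololayer.h and yolov5.cpp in your build (the JulyLi2019 repository referenced in section 3.2 may also ship a ready-made config):
cat > ~/Triton/model_repository/helmet_detection/config.pbtxt <<'EOF'
name: "helmet_detection"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "data"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "prob"
    data_type: TYPE_FP32
    dims: [ 6001, 1, 1 ]
  }
]
EOF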
Create start_server.sh (the paths must match the volume mapping used when the container is created)
LD_PRELOAD=/Triton/plugins/helmet_detection/libmyplugins.so tritonserver --model-repository=/Triton/model_repository/
Start the Triton Server container for the first time
# docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/full/path/to/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:<xx.yy>-py3 tritonserver --model-repository=/models
sudo docker run \
--gpus all \
--shm-size=1g \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-p 8000:8000 -p 8001:8001 -p 8002:8002 \
--name trt_serving \
-v /home/huaxi/Triton:/Triton \
-itd \
nvcr.io/nvidia/tritonserver:21.09-py3 \
/bin/bash /Triton/start_server.sh
Enter the running container
sudo docker stop 515f33be25b6
sudo docker ps -a # find the ID of the container to enter
sudo docker start -ia 515f33be25b6 # start the stopped container by its ID and attach
sudo docker exec -it 515f33be25b6 /bin/bash # enter the running container
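Once the server container is up, readiness can be checked from the host over Triton's standard HTTP API on port 8000 (the model name below matches the repository directory created in 3.1):
curl -v localhost:8000/v2/health/ready # returns HTTP 200 when the server and its models are ready
curl localhost:8000/v2/models/helmet_detection # prints metadata for the deployed model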
3.2 Deploy the Triton Server Client and test the deployed model
Reference: https://blog.csdn.net/JulyLi2019/article/details/119875633
Reference: https://github.com/JulyLi2019/tensorrt-yolov5 (download the source code)
Enter the conda environment: conda activate python38
conda install -c conda-forge python-rapidjson
pip install tritonclient==2.18.0 # remove later with: pip uninstall tritonclient
cd ~/Triton_client/triton_client_yolov5
python client_image.py