DeepStream User Guide (based on the T4 card)

一、Introduction

DeepStream is a streaming analytics toolkit for building AI-powered applications. It takes streaming data as input (video from USB/CSI cameras, from files, or from streams over RTSP) and uses AI and computer vision to generate insights from pixels for a better understanding of the environment. The DeepStream SDK can serve as the foundation layer for many video analytics solutions.


Features

  • The core SDK consists of several hardware-accelerated plugins that use various accelerators such as VIC, GPU, DLA, NVDEC and NVENC.

  • DeepStream supports secure bidirectional communication between edge and cloud, and ships with several out-of-the-box security protocols such as username/password authentication and two-way TLS authentication.

  • DeepStream is built on top of several NVIDIA libraries from the CUDA-X stack, such as CUDA, TensorRT, the Triton Inference Server, and multimedia libraries. TensorRT accelerates AI inference on NVIDIA GPUs. DeepStream abstracts these libraries behind DeepStream plugins, so developers can build video analytics pipelines without having to learn all the individual libraries.

1.DeepStream Graph Architecture

DeepStream is an optimized graph architecture built on the open source GStreamer framework. The individual blocks in the graph are the various plugins that are used; at the bottom are the different hardware engines that are used throughout the application.

  • Streaming data can come in over RTSP, from the local file system, or directly from a camera over the network. The streams are captured using the CPU. Once the frames are in memory, they are sent for decoding on the NVDEC accelerator; the decoding plugin is called Gst-nvvideo4linux2.

    • Supports decoding H.264, H.265, JPEG, MPEG4, MPEG2, VP8, VP9

    • Supports encoding H.264 and H.265 (nvv4l2h264enc, nvv4l2h265enc)

    • Codec throughput (throughput tables omitted here)
  • The second step is an optional image pre-processing step: input images can be pre-processed before inference. These plugins use the GPU or the VIC (vision image compositor).

    • The Gst-nvdewarper plugin can dewarp images from fisheye or 360-degree cameras.
    • The Gst-nvvideoconvert plugin can perform color format conversion on frames.
  • The third step is batching the frames for optimal inference performance; batching is done with the Gst-nvstreammux plugin.

  • The fourth step is running inference on the batched frames. Inference can be done with TensorRT, NVIDIA's inference runtime, or in native frameworks such as TensorFlow or PyTorch using the Triton Inference Server. On Jetson AGX Xavier and Xavier NX, inference can run on the GPU or on the DLA (Deep Learning Accelerator).

    • Native TensorRT inference is performed with the Gst-nvinfer plugin
    • Inference through Triton is performed with the Gst-nvinferserver plugin
  • The fifth step is object tracking. The SDK ships with several built-in reference trackers, ranging from high performance to high accuracy. Object tracking is performed with the Gst-nvtracker plugin.

  • The sixth step is creating visualization artifacts, such as bounding boxes, segmentation masks and labels, with the Gst-nvdsosd visualization plugin.

  • The seventh step is outputting the results, for which DeepStream offers several options:

    • Render the output with bounding boxes on screen;
    • Save the output to local disk;
    • Stream the output over RTSP;
    • Send the metadata to the cloud;
      • Gst-nvmsgconv converts the metadata into a schema payload
      • Gst-nvmsgbroker establishes the connection to the cloud and sends the telemetry data.
      • Several broker protocols are built in, such as Kafka, MQTT, AMQP and Azure IoT; custom broker adapters can also be created.
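
The steps above map directly onto GStreamer elements. As a minimal sketch (assuming DeepStream 5.1 is installed and using the sample stream and config shipped with the SDK), a single-stream decode, batch, infer, OSD, render pipeline can be built from Python like this:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
# filesrc -> h264parse -> NVDEC decode -> nvstreammux (batching) -> nvinfer -> nvdsosd -> on-screen sink
pipeline = Gst.parse_launch(
    "filesrc location=/opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264 "
    "! h264parse ! nvv4l2decoder "
    "! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 "
    "! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/config_infer_primary.txt "
    "! nvvideoconvert ! nvdsosd ! nveglglessink"
)
pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
# Block until end-of-stream or an error
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)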

2.DeepStream SDK 5.1

The DeepStream SDK is an accelerated AI framework for building intelligent video analytics (IVA) pipelines.

DeepStream 5.1 for Servers and Workstations

  • This release supports Tesla T4 and V100.

DeepStream 5.1 for Jetson

  • This release supports Jetson TX1, TX2, Nano, NX and AGX Xavier.

                  Jetson             T4 and A100 (x86)
Operating System  Ubuntu 18.04       Ubuntu 18.04 / RHEL 8
Dependencies      CUDA: 10.2.89      CUDA: 11.1
                  cuDNN: 8.0.0+      cuDNN: 8.0.0+
                  TensorRT: 7.1.3    TensorRT: 7.2.2
                  JetPack: 4.5.1     Driver: R460.32+

1)Jetson model Platform and OS Compatibility

DS release DS 5.0 GA, 5.0.1, 5.1 (Unified)
Jetson platforms Nano, AGX Xavier, TX2, TX1, Jetson NX
OS L4T Ubuntu 18.04
JetPack release 4.4 GA (4.5.1 GA for DS 5.1)
L4T release 32.4.3 (32.5.1 for DS 5.1)
CUDA release CUDA 10.2
cuDNN release cuDNN 8.0.0.x
TensorRT release TRT 7.1.3
OpenCV release OpenCV 4.1.1
VisionWorks VisionWorks 1.6
GStreamer GStreamer 1.14.1
Docker image deepstream-l4t:5.0, deepstream-l4t:5.0.1, deepstream-l4t:5.1

2)dGPU model Platform and OS Compatibility

DS release DS 5.0 GA, 5.0.1 (Unified), 5.1
GPU platforms P4, T4, V100, GA100 (DS 5.1)
OS Ubuntu 18.04 RHEL 8.x
GCC GCC 7.3.0
CUDA release CUDA 10.2 (CUDA 11.1 for DS 5.1)
cuDNN release cuDNN 7.6.5+ (cuDNN 8.0+ for DS 5.1)
TRT release TRT 7.0.0 (TRT 7.2.x for DS 5.1)
Display Driver R450.51 (R460.32 for DS 5.1)
VideoSDK release SDK 9.1
OFSDK release 1.0.10
GStreamer release GStreamer 1.14.1
OpenCV release OpenCV 3.4.0
Docker image deepstream:5.0, deepstream:5.0.1, deepstream:5.1

3.Transfer Learning Toolkit

NVIDIA TAO (Train, Adapt, and Optimize) is an AI model adaptation platform that simplifies and accelerates the creation of enterprise AI applications and services. Through a guided, UI-based workflow, users can fine-tune pre-trained models with their own data, without needing massive training runs or deep AI expertise, and produce highly accurate computer vision, speech, and language understanding models in hours instead of months.

The Transfer Learning Toolkit (TLT) is the core component of the NVIDIA TAO platform. It is a Python-based AI toolkit that provides a one-stop solution for transfer training, pruning, and quantization of pre-trained models.


NVIDIA provides a set of pre-trained models maintained for the TLT toolkit in the NGC repository, covering classic models for common CV tasks (face recognition, object detection, semantic segmentation, human pose estimation, classification, and so on).


1)NVIDIA AI Workflow

  • Choose the model you need from NVIDIA's library of pre-trained models or model architectures

  • Quickly train, adapt, and optimize the model for your application

  • Integrate the custom model into your application and deploy it

二、dGPU Setup for Ubuntu

This applies to NVIDIA GPU expansion card products such as the NVIDIA Tesla® T4 and P4, NVIDIA GeForce® GTX 1080, and NVIDIA GeForce® RTX 2080.

1.Install Dependencies

Remove all previous DeepStream installations

cd /opt/nvidia/deepstream/deepstream/
sudo bash ./uninstall.sh

Install Dependencies

$ sudo apt install \
libssl1.0.0 \
libgstreamer1.0-0 \
libgstreamer1.0-dev \
gstreamer1.0-tools \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-ugly \
gstreamer1.0-libav \
libgstrtspserver-1.0-0 \
libjansson4

2.Installing the GPU Driver Directly [not recommended]

https://wangjunjian.com/gpu/2020/11/03/install-nvidia-gpu-driver-on-ubuntu.html

1)Uninstall the old GPU driver

# Remove the old driver before reinstalling
sudo apt-get remove nvidia*    # Do NOT reboot at this point; rebooting now may leave the system unable to boot.
sudo apt-get autoremove --purge nvidia*
sudo /usr/bin/nvidia-uninstall

2)Disable nouveau

The nouveau driver must be disabled before installing the NVIDIA driver; otherwise it conflicts with the NVIDIA driver and the installation fails.

  • sudo vim /etc/modprobe.d/blacklist.conf and append the following two lines at the end:

    blacklist nouveau
    options nouveau modeset=0
    
  • Regenerate the kernel initramfs:
    sudo update-initramfs -u

  • Reboot
    sudo reboot

  • Check whether nouveau is disabled; if the command returns nothing, it was disabled successfully

    sudo lsmod | grep nouveau

3)Install NVIDIA driver 460.32

Check the GPU model

$ lspci | grep -i nvidia
5e:00.0 3D controller: NVIDIA Corporation Device 1eb8 (rev a1)

Download the generic driver

  • Note: the T4-specific driver package has no CUDA 11.1 build, which makes the subsequent installation steps fail
  • CUDA Version 11.2 is installed along with this driver by default
$ chmod 755 NVIDIA-Linux-x86_64-460.32.03.run
$ sudo ./NVIDIA-Linux-x86_64-460.32.03.run --no-opengl-files --no-x-check --no-nouveau-check
# sudo ./NVIDIA-Linux-x86_64-440.64.00.run --no-opengl-files --no-x-check --no-nouveau-check --kernel-source-path=/usr/src/linux-headers-5.4.0-65-generic
  • Show driver information
$ nvidia-smi -L
GPU 0: Tesla T4 (UUID: GPU-158692f4-a7b8-dbe6-3376-65b22b16068d)
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:5E:00.0 Off |                  Off |
| N/A   52C    P0    22W /  70W |      0MiB / 16127MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

4)Uninstall the NVIDIA driver

sudo ./NVIDIA-Linux-x86_64-460.32.03.run --uninstall

3.Installing CUDA 11.1 and the NVIDIA Driver [recommended]

nouveau still needs to be disabled first (see above).

  • Download and install CUDA 11.1

    Installing CUDA also installs the NVIDIA driver:

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
    sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
    sudo apt-key add /var/cuda-repo-ubuntu1804-11-1-local/7fa2af80.pub
    sudo apt-get update
    sudo apt-get -y install cuda
    # Installation path
    ls  /usr/local/cuda-11.1/
    
  • Verify that the driver installed successfully; reinstalling CUDA automatically replaces the old version

    $ nvidia-smi -L
    GPU 0: Tesla T4 (UUID: GPU-158692f4-a7b8-dbe6-3376-65b22b16068d)
    
    $ nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 455.32.00    Driver Version: 455.32.00    CUDA Version: 11.1     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:5E:00.0 Off |                  Off |
    | N/A   58C    P0    30W /  70W |      0MiB / 16127MiB |      4%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
                                                                                   
    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
    
  • Configure the CUDA environment variables

    sudo vim ~/.bashrc
    # Append the following at the end
      export PATH=/usr/local/cuda-11.1/bin:$PATH
      export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH
      export LD_LIBRARY_PATH=/usr/local/cuda-11.1/targets/x86_64-linux/lib/:/usr/local/cuda-11.1/targets/x86_64-linux/lib/stubs:$LD_LIBRARY_PATH
      
    source  ~/.bashrc
    
    # Check the CUDA version
    nvcc --version
    
  • Uninstall CUDA

    cuda-uninstaller
    

4.Install TensorRT 7.2.2

https://developer.nvidia.com/nvidia-tensorrt-download

https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-723/install-guide/index.html#installing-debian

sudo dpkg -i nv-tensorrt-repo-ubuntu1804-cuda11.1-trt7.2.2.3-ga-20201211_1-1_amd64.deb
sudo apt-key add /var/nv-tensorrt-repo-cuda11.1-trt7.2.2.3-ga-20201211/7fa2af80.pub
sudo apt-get update
sudo apt-get install tensorrt 

# If using Python 2.7:
sudo apt-get install python-libnvinfer-dev
# If using Python 3.x:
sudo apt-get install python3-libnvinfer-dev

5.Install cuDNN

https://developer.nvidia.com/rdp/cudnn-archive#a-collapse811-111

#------------------Install cuDNN 8.1-------------------------------------
tar -xzvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
sudo cp cuda/include/* /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

6.Install the DeepStream SDK

$ wget https://developer.nvidia.com/deepstream-51-510-1-amd64deb
$ sudo apt-get install ./deepstream-5.1_5.1.0-1_amd64.deb

$ deepstream-app --version-all
(gst-plugin-scanner:27728): GStreamer-WARNING **: 17:12:39.280: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_inferserver.so': libtritonserver.so: cannot open shared object file: No such file or directory
deepstream-app version 5.1.0
DeepStreamSDK 5.1.0
CUDA Driver Version: 11.1
CUDA Runtime Version: 11.1
TensorRT Version: 7.2
cuDNN Version: 8.1
libNVWarp360 Version: 2.0.1d3

# Note: this is a harmless warning. It means DeepStream's nvinferserver plugin cannot be used because the Triton Inference Server is not installed on the x86 (dGPU) platform.

nvds plugins

$ gst-inspect-1.0 --plugin|grep nvds
nvdsgst_infer:  nvinfer: NvInfer plugin
nvdsgst_ofvisual:  nvofvisual: nvofvisual
nvdsgst_osd:  nvdsosd: NvDsOsd plugin
nvdsgst_inferaudio:  nvinferaudio: NvInfer Audio plugin
nvdsgst_dsanalytics:  nvdsanalytics: DsAnalytics plugin
nvdsgst_jpegdec:  nvjpegdec: JPEG image decoder
nvdsgst_audiotemplate:  nvdsaudiotemplate: DS AUDIO template Plugin for Transform IP use-cases
nvdsgst_msgbroker:  nvmsgbroker: Message Broker
nvdsgst_dewarper:  nvdewarper: nvdewarper
nvdsgst_multistream:  nvstreamdemux: Stream demultiplexer
nvdsgst_multistream:  nvstreammux: Stream multiplexer
nvdsgst_multistreamtiler:  nvmultistreamtiler: Stream Tiler DS
nvdsgst_tracker:  nvtracker: NvTracker plugin
nvdsgst_videotemplate:  nvdsvideotemplate: NvDsVideoTemplate plugin for Transform/In-Place use-cases
nvdsgst_segvisual:  nvsegvisual: nvsegvisual
nvdsgst_of:  nvof: nvof
nvdsgst_eglglessink:  nveglglessink: EGL/GLES vout Sink
nvdsgst_dsexample:  dsexample: DsExample plugin
nvdsgst_msgconv:  nvmsgconv: Message Converter
nvvideoconvert:  nvvideoconvert: NvVidConv Plugin
nvvideo4linux2:  nvv4l2h265enc: V4L2 H.265 Encoder
nvvideo4linux2:  nvv4l2h264enc: V4L2 H.264 Encoder
nvvideo4linux2:  nvv4l2decoder: NVIDIA v4l2 video decoder

$ gst-inspect-1.0 nvdsgst_infer
Plugin Details:
  Name                     nvdsgst_infer
  Description              NVIDIA DeepStreamSDK TensorRT plugin
  Filename                 /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so
  Version                  5.1.0
  License                  Proprietary
  Source module            nvinfer
  Binary package           NVIDIA DeepStreamSDK TensorRT plugin
  Origin URL               http://nvidia.com/

  nvinfer: NvInfer plugin

  1 features:
  +-- 1 elements

Installation directory

$ tree /opt/nvidia/deepstream/deepstream-5.1 -L 2
/opt/nvidia/deepstream/deepstream-5.1
├── bin # test binaries
│   ├── deepstream-app
│   ├── deepstream-appsrc-test
│   ├── deepstream-audio
│   ├── deepstream-dewarper-app
│   ├── deepstream-gst-metadata-app
│   ├── deepstream-image-decode-app
│   ├── deepstream-image-meta-test
│   ├── deepstream-infer-tensor-meta-app
│   ├── deepstream-mrcnn-app
│   ├── deepstream-nvdsanalytics-test
│   ├── deepstream-nvof-app
│   ├── deepstream-opencv-test
│   ├── deepstream-perf-demo
│   ├── deepstream-segmentation-app
│   ├── deepstream-test1-app
│   ├── deepstream-test2-app
│   ├── deepstream-test3-app
│   ├── deepstream-test4-app
│   ├── deepstream-test5-app
│   ├── deepstream-testsr-app
│   ├── deepstream-transfer-learning-app
│   └── deepstream-user-metadata-app
├── install.sh
├── lib     # C++ plugin shared libraries
│   ├── gst-plugins
│   │   ├── libgstnvvideo4linux2.so
│   │   ├── libgstnvvideoconvert.so
│   │   ├── libnvdsgst_audiotemplate.so
│   │   ├── libnvdsgst_dewarper.so
│   │   ├── libnvdsgst_dsanalytics.so
│   │   ├── libnvdsgst_dsexample.so
│   │   ├── libnvdsgst_eglglessink.so
│   │   ├── libnvdsgst_inferaudio.so
│   │   ├── libnvdsgst_inferserver.so
│   │   ├── libnvdsgst_infer.so
│   │   ├── libnvdsgst_jpegdec.so
│   │   ├── libnvdsgst_msgbroker.so
│   │   ├── libnvdsgst_msgconv.so
│   │   ├── libnvdsgst_multistream_2.a
│   │   ├── libnvdsgst_multistream.so
│   │   ├── libnvdsgst_multistreamtiler.so
│   │   ├── libnvdsgst_of.so
│   │   ├── libnvdsgst_ofvisual.so
│   │   ├── libnvdsgst_osd.so
│   │   ├── libnvdsgst_segvisual.so
│   │   ├── libnvdsgst_tracker.so
│   │   └── libnvdsgst_videotemplate.so
│   ├── libcuvidv4l2.so
│   ├── libiothub_client.so
│   ├── libiothub_client.so.1 -> /opt/nvidia/deepstream/deepstream-5.1/lib/libiothub_client.so
│   ├── libnvbuf_fdmap.so
│   ├── libnvbufsurface.so
│   ├── libnvbufsurftransform.so
│   ├── libnvds_amqp_proto.so
│   ├── libnvds_audiotransform.so
│   ├── libnvds_azure_edge_proto.so
│   ├── libnvds_azure_proto.so
│   ├── libnvds_batch_jpegenc.so
│   ├── libnvdsbufferpool.so
│   ├── libnvds_csvparser.so
│   ├── libnvds_dewarper.so
│   ├── libnvds_dsanalytics.so
│   ├── libnvdsgst_audio.so
│   ├── libnvdsgst_bufferpool.so
│   ├── libnvdsgst_helper.so
│   ├── libnvdsgst_inferbase.so
│   ├── libnvdsgst_meta.so
│   ├── libnvdsgst_smartrecord.so
│   ├── libnvdsgst_tensor.so
│   ├── libnvdsinfer_custom_impl_fasterRCNN.so
│   ├── libnvdsinfer_custom_impl_ssd.so
│   ├── libnvdsinfer_custom_impl_Yolo.so
│   ├── libnvds_infer_custom_parser_audio.so
│   ├── libnvds_infercustomparser.so
│   ├── libnvds_infer_server.so
│   ├── libnvds_infer.so
│   ├── libnvds_inferutils.so
│   ├── libnvds_kafka_proto.so
│   ├── libnvds_lljpegdec.so
│   ├── libnvds_logger.so
│   ├── libnvds_meta.so
│   ├── libnvds_mot_iou.so
│   ├── libnvds_mot_klt.so
│   ├── libnvds_msgbroker.so
│   ├── libnvds_msgconv_audio.so
│   ├── libnvds_msgconv.so
│   ├── libnvds_nvdcf.so
│   ├── libnvds_nvtxhelper.so
│   ├── libnvds_opticalflow_dgpu.so
│   ├── libnvds_osd.so
│   ├── libnvds_redis_proto.so
│   ├── libnvds_tracker.so
│   ├── libnvds_utils.so
│   ├── libnvv4l2.so
│   ├── libnvv4lconvert.so
│   ├── libnvvpi.so.1 -> /opt/nvidia/deepstream/deepstream-5.1/lib/libnvvpi.so.1.0.12
│   ├── libnvvpi.so.1.0.12
│   ├── libv4l
│   ├── pkg-config
│   ├── pyds.so
│   └── setup.py
├── LicenseAgreement.pdf
├── LICENSE.txt
├── README
├── README.rhel
├── samples # sample applications
│   ├── configs # sample configuration files
│   │   ├── deepstream-app
│   │   │   ├── config_infer_primary_endv.txt
│   │   │   ├── config_infer_primary_nano.txt   # configures the nvinfer element as the primary detector on Nano
│   │   │   ├── config_infer_primary.txt    # configures the nvinfer element as the primary detector
│   │   │   ├── config_infer_secondary_carcolor.txt # configures the nvinfer element as a secondary classifier
│   │   │   ├── config_infer_secondary_carmake.txt
│   │   │   ├── config_infer_secondary_vehicletypes.txt
│   │   │   ├── config_mux_source30.txt
│   │   │   ├── config_mux_source4.txt
│   │   │   ├── iou_config.txt
│   │   │   ├── source1_usb_dec_infer_resnet_int8.txt   # one USB camera as input
│   │   │   ├── source30_1080p_dec_infer-resnet_tiled_display_int8.txt  # 30 x 1080p inputs: decode, inference, display
│   │   │   ├── source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8_gpu1.txt # 4 x 1080p inputs on GPU 1: decode, inference, tracking, display
│   │   │   ├── source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt # 4 x 1080p inputs: decode, inference, tracking, display
│   │   │   └── tracker_config.yml
│   │   ├── deepstream-app-trtis
│   │   └── tlt_pretrained_models   # config files for the TLT pre-trained models; the corresponding models still need to be downloaded into samples/models/tlt_pretrained_models
│   │       ├── config_infer_primary_dashcamnet.txt
│   │       ├── config_infer_primary_detectnet_v2.txt
│   │       ├── config_infer_primary_dssd.txt
│   │       ├── config_infer_primary_facedetectir.txt
│   │       ├── config_infer_primary_frcnn.txt
│   │       ├── config_infer_primary_mrcnn.txt
│   │       ├── config_infer_primary_peoplenet.txt
│   │       ├── config_infer_primary_retinanet.txt
│   │       ├── config_infer_primary_ssd.txt
│   │       ├── config_infer_primary_trafficcamnet.txt
│   │       ├── config_infer_primary_yolov3.txt
│   │       ├── config_infer_secondary_vehiclemakenet.txt
│   │       ├── config_infer_secondary_vehicletypenet.txt
│   │       ├── deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt
│   │       ├── deepstream_app_source1_detection_models.txt
│   │       ├── deepstream_app_source1_facedetectir.txt
│   │       ├── deepstream_app_source1_mrcnn.txt
│   │       ├── deepstream_app_source1_peoplenet.txt
│   │       ├── deepstream_app_source1_trafficcamnet.txt
│   │       ├── detectnet_v2_labels.txt
│   │       ├── dssd_labels.txt
│   │       ├── frcnn_labels.txt
│   │       ├── labels_dashcamnet.txt
│   │       ├── labels_facedetectir.txt
│   │       ├── labels_peoplenet.txt
│   │       ├── labels_trafficnet.txt
│   │       ├── labels_vehiclemakenet.txt
│   │       ├── labels_vehicletypenet.txt
│   │       ├── mrcnn_labels.txt
│   │       ├── README
│   │       ├── retinanet_labels.txt
│   │       ├── ssd_labels.txt
│   │       └── yolov3_labels.txt
│   ├── models      # sample models
│   │   ├── Primary_Detector    # primary detector
│   │   │   ├── cal_trt.bin
│   │   │   ├── labels.txt
│   │   │   ├── resnet10.caffemodel
│   │   │   └── resnet10.prototxt
│   │   ├── Primary_Detector_Nano   # primary detector for Nano
│   │   │   ├── labels.txt
│   │   │   ├── resnet10.caffemodel
│   │   │   └── resnet10.prototxt
│   │   ├── Secondary_CarColor  # secondary classifier: car color classification
│   │   │   ├── cal_trt.bin
│   │   │   ├── labels.txt
│   │   │   ├── mean.ppm
│   │   │   ├── resnet18.caffemodel
│   │   │   └── resnet18.prototxt
│   │   ├── Secondary_CarMake
│   │   │   ├── cal_trt.bin
│   │   │   ├── labels.txt
│   │   │   ├── mean.ppm
│   │   │   ├── resnet18.caffemodel
│   │   │   └── resnet18.prototxt
│   │   ├── Secondary_VehicleTypes  # secondary classifier: vehicle type classification
│   │   │   ├── cal_trt.bin
│   │   │   ├── labels.txt
│   │   │   ├── mean.ppm
│   │   │   ├── resnet18.caffemodel
│   │   │   └── resnet18.prototxt
│   │   ├── Segmentation    # segmentation models
│   │   │   ├── industrial
│   │   │   └── semantic
│   │   └── SONYC_Audio_Classifier
│   │       ├── audio_labels_car.txt
│   │       ├── audio_labels.txt
│   │       ├── sonyc_audio_classifier.onxx
│   │       ├── sonyc_audio_classify_car.onxx
│   │       └── sonyc_audio_classify.onnx
│   ├── prepare_classification_test_video.sh
│   ├── prepare_ds_trtis_model_repo.sh
│   ├── streams     # media files
│   │   ├── sample_1080p_h264.mp4
│   │   ├── sample_1080p_h265.mp4
│   │   ├── sample_720p.h264
│   │   ├── sample_720p.jpg
│   │   ├── sample_720p.mjpeg
│   │   ├── sample_720p.mp4
│   │   ├── sample_cam5.mp4
│   │   ├── sample_cam6.mp4
│   │   ├── sample_industrial.jpg
│   │   ├── sample_qHD.h264
│   │   ├── sample_qHD.mp4
│   │   ├── sonyc_mixed_audio.wav
│   │   ├── yoga.jpg
│   │   └── yoga.mp4
│   └── trtis_model_repo
├── sources
│   ├── apps    # test code for deepstream-app
│   │   ├── apps-common
│   │   │   ├── includes
│   │   │   └── src
│   │   └── sample_apps     # sample applications
│   │       ├── deepstream-app
│   │       ├── deepstream-appsrc-test
│   │       ├── deepstream-audio
│   │       ├── deepstream-dewarper-test
│   │       ├── deepstream-gst-metadata-test
│   │       ├── deepstream-image-decode-test
│   │       ├── deepstream-image-meta-test
│   │       ├── deepstream-infer-tensor-meta-test
│   │       ├── deepstream-mrcnn-app
│   │       ├── deepstream-nvdsanalytics-test
│   │       ├── deepstream-nvof-test
│   │       ├── deepstream-opencv-test
│   │       ├── deepstream-perf-demo
│   │       ├── deepstream-segmentation-test
│   │       ├── deepstream-test1
│   │       ├── deepstream-test2
│   │       ├── deepstream-test3
│   │       ├── deepstream-test4
│   │       ├── deepstream-test5
│   │       ├── deepstream-testsr
│   │       ├── deepstream-transfer-learning-app
│   │       └── deepstream-user-metadata-test
│   ├── gst-plugins # NVIDIA GStreamer plugins
│   │   ├── gst-dsexample
│   │   ├── gst-nvdsaudiotemplate
│   │   ├── gst-nvdsosd
│   │   ├── gst-nvdsvideotemplate
│   │   ├── gst-nvinfer     # TensorRT inference plugin
│   │   ├── gst-nvmsgbroker
│   │   └── gst-nvmsgconv
│   ├── includes
│   ├── libs    # libraries that gst-plugins depend on
│   │   ├── amqp_protocol_adaptor
│   │   ├── azure_protocol_adaptor
│   │   ├── kafka_protocol_adaptor
│   │   ├── nvdsinfer   # NvDsInfer library
│   │   ├── nvdsinfer_customparser  # bounding-box parser templates for custom models
│   │   ├── nvmsgbroker
│   │   ├── nvmsgconv
│   │   ├── nvmsgconv_audio
│   │   └── redis_protocol_adaptor
│   ├── objectDetector_FasterRCNN   # Faster RCNN detector
│   ├── objectDetector_SSD  # UFF SSD detector
│   ├── objectDetector_Yolo # Yolo detector
│   ├── SONYCAudioClassifier
│   └── tools
├── uninstall.sh
└── version

7.Install tao-converter

tao-converter converts TLT pre-trained model files (.etlt) into TensorRT engine files (.engine), optimizing for model speed, precision, and stability.

Download the tao-converter build that matches your CUDA/TensorRT versions

# Extract the archive
cd ~/Downloads
unzip cuda11.1-trt7.2-20210820T231205Z-001.zip
cd cuda11.1-trt7.2
chmod 777 tao-converter
sudo cp ~/Downloads/cuda11.1-trt7.2/tao-converter /usr/local/bin/

# Set the environment variables (do this for root and for normal users)
sudo vim ~/.bashrc
# Append the following at the end
    export TRT_LIB_PATH="/usr/lib/x86_64-linux-gnu"
    export TRT_INC_PATH="/usr/include/x86_64-linux-gnu"
source ~/.bashrc

tao-converter -h

三、Sample Apps Source

1. C/C++ Sample Apps Source

The sources directory is located at /opt/nvidia/deepstream/deepstream-5.1/sources

DeepStream C/C++ API

Install dependencies

sudo apt-get install libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev \
   libgstrtspserver-1.0-dev libx11-dev libjson-glib-dev

1)deepstream-app

Because building a detection pipeline through DeepStream's low-level C API or its Python API is still somewhat tedious, NVIDIA distilled the most common deep learning processing flows into the reference application deepstream-app. It lets the user describe the detection pipeline in a configuration file; deepstream-app then invokes the corresponding DeepStream plugins according to that description and builds the pipeline. So although deepstream-app is a reference application, it is often used as DeepStream's CLI tool.

How deepstream-app works


At the front end, decode plugins read in the video streams (RTSP, files, USB cameras, etc.). Multiple camera streams are merged by the MUX into a batch and fed to the primary detector (object detection) to obtain bounding boxes, which are then passed to the tracker for tracking. Each tracked bounding box is further fed to secondary detectors (usually classifiers). The results are sent to the tiler to compose a tiled 2D frame, the osd plugin then uses the generated metadata to draw shaded boxes, rectangles and text on the composited frame, and finally the result is output (sink).

The results can be output in the following ways:

  • Fakesink
  • EGL based windowed sink (nveglglessink)
  • Encode + File Save (encoder + muxer + filesink)
  • Encode + RTSP streaming
  • Overlay (Jetson only)
  • Message converter + Message broker
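
With deepstream-app, the output is selected in the [sink] groups of its configuration file. A minimal sketch for RTSP streaming output (key names follow the shipped sample configs; the values are illustrative):

[sink0]
enable=1
type=4                # 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
codec=1               # 1=H.264 2=H.265
bitrate=4000000
rtsp-port=8554
udp-port=5400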
$ deepstream-app --help-all
Usage:
  deepstream-app [OPTION?] Nvidia DeepStream Demo

Help Options:
  -h, --help                        Show help options
  --help-all                        Show all help options
  --help-gst                        Show GStreamer Options

GStreamer Options
  --gst-version                     Print the GStreamer version
  --gst-fatal-warnings              Make all warnings fatal
  --gst-debug-help                  Print available debug categories and exit
  --gst-debug-level=LEVEL           Default debug level from 1 (only error) to 9 (anything) or 0 for no output
  --gst-debug=LIST                  Comma-separated list of category_name:level pairs to set specific levels for the individual categories. Example: GST_AUTOPLUG:5,GST_ELEMENT_*:3
  --gst-debug-no-color              Disable colored debugging output
  --gst-debug-color-mode            Changes coloring mode of the debug log. Possible modes: off, on, disable, auto, unix
  --gst-debug-disable               Disable debugging
  --gst-plugin-spew                 Enable verbose plugin loading diagnostics
  --gst-plugin-path=PATHS           Colon-separated paths containing plugins
  --gst-plugin-load=PLUGINS         Comma-separated list of plugins to preload in addition to the list stored in environment variable GST_PLUGIN_PATH
  --gst-disable-segtrap             Disable trapping of segmentation faults during plugin loading
  --gst-disable-registry-update     Disable updating the registry
  --gst-disable-registry-fork       Disable spawning a helper process while scanning the registry

Application Options:
  -v, --version                     Print DeepStreamSDK version
  -t, --tiledtext                   Display Bounding box labels in tiled mode
  --version-all                     Print DeepStreamSDK and dependencies version
  -c, --cfg-file                    Set the config file
  -i, --input-file                  Set the input file

The deepstream-app configuration file uses the key file format, based on the freedesktop specification.

The configuration file consists of a number of optional configuration groups, as sketched below.
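
A minimal sketch of a deepstream-app configuration (group and key names follow the shipped samples; paths and values are illustrative):

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[source0]
enable=1
type=3                # 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
uri=file:///opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0

[streammux]
gpu-id=0
batch-size=1
batched-push-timeout=40000
width=1920
height=1080

[primary-gie]
enable=1
gpu-id=0
config-file=config_infer_primary.txt

[osd]
enable=1

[sink0]
enable=1
type=2                # render in an EGL window
sync=1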

2)Sample application overview

Sample source details

Sample Configurations and Streams

Path inside sources directory Description
apps/sample_apps/deepstream-test1 Sample of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvdsosd → renderer.
apps/sample_apps/deepstream-test2 Sample of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvtracker → nvinfer (secondary classifier) → nvdsosd → renderer.
apps/sample_apps/deepstream-test3 Builds on deepstream-test1 (simple test application 1) to demonstrate how to: use multiple sources in the pipeline; use a uridecodebin to accept any type of input (e.g. RTSP/File), any GStreamer supported container format, and any codec; configure Gst-nvstreammux to generate a batch of frames and infer on it for better resource utilization; extract the stream metadata, which contains useful information about the frames in the batched buffer.
apps/sample_apps/deepstream-test4 Builds on deepstream-test1 for a single H.264 stream: filesrc, decode, nvstreammux, nvinfer, nvdsosd, renderer, to demonstrate how to: use the Gst-nvmsgconv and Gst-nvmsgbroker plugins in the pipeline; create NVDS_META_EVENT_MSG type metadata and attach it to the buffer; use NVDS_META_EVENT_MSG for different types of objects, e.g. vehicle and person; implement “copy” and “free” functions for use if metadata is extended through the extMsg field.
apps/sample_apps/deepstream-test5 Builds on top of deepstream-app. Demonstrates: use of the Gst-nvmsgconv and Gst-nvmsgbroker plugins in the pipeline for multistream; how to configure the Gst-nvmsgbroker plugin from the config file as a sink plugin (for KAFKA, Azure, etc.); how to handle RTCP sender reports from RTSP servers or cameras and translate the Gst Buffer PTS to a UTC timestamp. For more details refer to the RTCP Sender Report callback function test5_rtcp_sender_report_callback() registration and usage in deepstream_test5_app_main.c. GStreamer callback registration with the rtpmanager element’s “handle-sync” signal is documented in apps-common/src/deepstream_source_bin.c.
libs/amqp_protocol_adaptor Application to test the AMQP protocol.
libs/azure_protocol_adaptor Test application to show Azure IoT device2edge messaging and device2cloud messaging using MQTT.
apps/sample_apps/deepstream-app Source code for the DeepStream reference application.
sources/objectDetector_SSD Configuration files and custom library implementation for the SSD detector model.
sources/objectDetector_FasterRCNN Configuration files and custom library implementation for the FasterRCNN model.
sources/objectDetector_Yolo Configuration files and custom library implementation for the Yolo models, currently Yolo v2, v2 tiny, v3, and v3 tiny.
apps/sample_apps/deepstream-dewarper-test Demonstrates dewarper functionality for single or multiple 360-degree camera streams. Reads camera calibration parameters from a CSV file and renders aisle and spot surfaces on the display.
apps/sample_apps/deepstream-nvof-test Demonstrates optical flow functionality for single or multiple streams. This example uses two GStreamer plugins (Gst-nvof and Gst-nvofvisual). The Gst-nvof element generates the MV (motion vector) data and attaches it as user metadata. The Gst-nvofvisual element visualizes the MV data using a predefined color wheel matrix.
apps/sample_apps/deepstream-user-metadata-test Demonstrates how to add custom or user-specific metadata to any component of DeepStream. The test code attaches a 16-byte array filled with user data to the chosen component. The data is retrieved in another component.
apps/sample_apps/deepstream-image-decode-test Builds on deepstream-test3 to demonstrate image decoding instead of video. This example uses a custom decode bin so the MJPEG codec can be used as input.
apps/sample_apps/deepstream-segmentation-test Demonstrates segmentation of multi-stream video or images using a semantic or industrial neural network and rendering output to a display.
apps/sample_apps/deepstream-gst-metadata-test Demonstrates how to set metadata before the Gst-nvstreammux plugin in the DeepStream pipeline, and how to access it after Gst-nvstreammux.
apps/sample_apps/deepstream-infer-tensor-meta-app Demonstrates how to flow and access nvinfer tensor output as metadata.
apps/sample_apps/deepstream-perf-demo Performs single channel cascaded inferencing and object tracking sequentially on all streams in a directory.
apps/sample_apps/deepstream-nvdsanalytics-test Demonstrates batched analytics such as ROI filtering, line crossing, direction detection and overcrowding.
apps/sample_apps/deepstream-opencv-test Demonstrates the use of OpenCV in the dsexample plugin.
apps/sample_apps/deepstream-image-meta-test Demonstrates how to attach an encoded image as metadata and save the images in JPEG format.
apps/sample_apps/deepstream-appsrc-test Demonstrates AppSrc and AppSink usage for consuming and giving data from non-DeepStream code respectively.
apps/sample_apps/deepstream-transfer-learning-app Demonstrates a mechanism to save images of objects with lower detection confidence, which can then be used for further training.
apps/sample_apps/deepstream-mrcnn-test Demonstrates instance segmentation using the Mask-RCNN model.
apps/sample_apps/deepstream-audio Source code for the DeepStream reference application demonstrating an audio analytics pipeline.
apps/sample_apps/deepstream-testsr Demonstrates event-based smart record functionality.

3)Sample test

$ cd /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-test1
$ deepstream-test1-app /opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264

2.Python Sample Apps Source

This section provides information on developing DeepStream applications in Python.

The Python bindings are included in the DeepStream 5.1 SDK; the sample applications are available from the repository cloned below.

1)Python bindings

DeepStream supports application development in C/C++ and, through the Python bindings, in Python. DeepStream pipelines can be constructed using Gst-Python, the GStreamer framework's Python bindings. To access DeepStream metadata, the Python bindings are provided as a compiled module that ships with the DeepStream SDK.

  • DeepStream Python API


A DeepStream Python application uses the Gst-Python API to construct the pipeline and uses probe functions to access data at various points in the pipeline. The data types are all native C types, which the bindings (together with NumPy for buffer access) expose to the Python application. Tensor data is the raw tensor output produced by inference; for object detection, this tensor data needs to be post-processed with parsing and clustering algorithms to create bounding boxes around the detected objects.

The Python bindings are installed at /opt/nvidia/deepstream/deepstream/lib/pyds.so
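
Metadata access from Python goes through pyds. A minimal probe sketch, modeled on the deepstream-test1 sample (attach it to a pad upstream of the sink, e.g. the nvdsosd sink pad):

import pyds
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def osd_sink_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    # NvDsBatchMeta was attached upstream by nvstreammux
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        print(f"Frame {frame_meta.frame_num}: {frame_meta.num_obj_meta} objects")
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# Attach to the OSD sink pad of an existing pipeline:
# osdsinkpad = osd.get_static_pad("sink")
# osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)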

2)Running the Sample Applications

Install dependencies

$ sudo apt install python3-gi python3-dev python3-gst-1.0 -y
$ sudo apt install python3-opencv
$ sudo apt install python3-numpy
$ sudo apt install libgstrtspserver-1.0-0 gstreamer1.0-rtsp
$ sudo apt install libgirepository1.0-dev
$ sudo apt install gobject-introspection gir1.2-gst-rtsp-server-1.0

Download the sample source code

$ cd /opt/nvidia/deepstream/deepstream-5.1/sources # required: the samples reference the models by relative paths
$ git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps

Run a sample

https://blog.csdn.net/leida_wt/article/details/113368272

$ cd /opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/deepstream-test1-rtsp-out
$ python3 deepstream_test1_rtsp_out.py -i ../../../../samples/streams/sample_720p.h264
Creating Pipeline 
Creating Source 
Creating H264Parser 
Creating Decoder 
Creating H264 Encoder
Creating H264 rtppay
Playing file ../../../../samples/streams/sample_720p.h264 
Adding elements to Pipeline 
Linking elements in the Pipeline 

 *** DeepStream: Launched RTSP Streaming at rtsp://localhost:8554/ds-test ***
Starting pipeline 

0:00:01.610084345  4075      0x31e5c00 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend()  [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:685 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x368x640       
1   OUTPUT kFLOAT conv2d_bbox     16x23x40        
2   OUTPUT kFLOAT conv2d_cov/Sigmoid 4x23x40         

0:00:01.610191478  4075      0x31e5c00 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext()  [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
0:00:01.611249626  4075      0x31e5c00 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus: [UID 1]: Load new model:dstest1_pgie_config.txt sucessfully
Frame Number=0 Number of Objects=6 Vehicle_count=4 Person_count=2
Frame Number=1 Number of Objects=6 Vehicle_count=4 Person_count=2
...
Frame Number=1441 Number of Objects=0 Vehicle_count=0 Person_count=0
End-of-stream

# While the app is running, open the stream in a player such as VLC
rtsp://<server-ip>:8554/ds-test
 
 $ nvidia-smi                                                                                                                                                                                             
+-----------------------------------------------------------------------------+                                                                                                
| NVIDIA-SMI 455.32.00    Driver Version: 455.32.00    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+                                                                                                
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |                                                                                                
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:5E:00.0 Off |                  Off |                                                                                                
| N/A   38C    P0    27W /  70W |    834MiB / 16127MiB |     10%      Default |                                                                                                
|                               |                      |                  N/A |                                                                                                
+-------------------------------+----------------------+----------------------+                                                                                                

+-----------------------------------------------------------------------------+                                                                                                
| Processes:                                                                  |                                                                                                
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |                                                                                                
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      5603      C   python3                           703MiB |
+-----------------------------------------------------------------------------+

3)Sample application overview

Path inside the GitHub repo Description
apps/deepstream-test1 Simple example of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvdsosd → renderer.
apps/deepstream-test2 Simple example of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvtracker → nvinfer (secondary classifier) → nvdsosd → renderer.
apps/deepstream-test3 Builds on deepstream-test1 (simple test application 1) to demonstrate how to: use multiple sources in the pipeline; use a uridecodebin to accept any type of input (e.g. RTSP/File), any GStreamer supported container format, and any codec; configure Gst-nvstreammux to generate a batch of frames and infer on it for better resource utilization; extract the stream metadata, which contains useful information about the frames in the batched buffer.
apps/deepstream-test4 Builds on deepstream-test1 for a single H.264 stream: filesrc, decode, nvstreammux, nvinfer, nvdsosd, renderer, to demonstrate how to: use the Gst-nvmsgconv and Gst-nvmsgbroker plugins in the pipeline; create NVDS_META_EVENT_MSG type metadata and attach it to the buffer; use NVDS_META_EVENT_MSG for different types of objects, e.g. vehicle and person; implement “copy” and “free” functions for use if metadata is extended through the extMsg field.
apps/deepstream-test1-usbcam Simple test application 1 modified to process a single stream from a USB camera.
apps/deepstream-test1-rtsp-out Simple test application 1 modified to output the visualization stream over RTSP.
apps/deepstream-imagedata-multistream Builds on simple test application 3 to demonstrate how to: access decoded frames as NumPy arrays in the pipeline; check the detection confidence of detected objects (DBSCAN or NMS clustering required); modify frames and see the changes reflected downstream in the pipeline; use OpenCV to annotate the frames and save them to file.
apps/deepstream-ssd-parser Demonstrates how to perform custom post-processing for inference output from the Triton Inference Server: use an SSD model on Triton Inference Server for object detection; enable custom post-processing and raw tensor export for Triton Inference Server via configuration file settings; access inference output tensors in the pipeline for post-processing in Python; add detected objects to the metadata; output the OSD visualization to an MP4 file.
apps/deepstream-opticalflow Demonstrates how to obtain optical flow metadata, and also how to: access optical flow vectors as a NumPy array; visualize optical flow using the obtained flow vectors and OpenCV.
apps/deepstream-segmentation Demonstrates how to obtain segmentation metadata, and also how to: access segmentation masks as a NumPy array; visualize segmentation using the obtained masks and OpenCV.
apps/deepstream-nvdsanalytics Demonstrates how to use the nvdsanalytics plugin and obtain analytics metadata.

3.TLT pre-trained models

https://docs.nvidia.com/tao/tao-toolkit/text/deepstream_tao_integration.html

Users can start from two types of pre-trained models:

  • Purpose-built pre-trained models: highly accurate models trained on thousands of data inputs for a specific task. These domain-focused models can be used directly for inference, or be used with the TAO Toolkit for transfer learning on the user's own dataset.
  • General purpose vision models: for these, the pre-trained weights merely serve as a starting point for building more complex models. For computer vision use cases, the weights are pre-trained on open image datasets and provide a better starting point for training than random weight initialization.

To deploy a model trained with the TAO Toolkit to DeepStream, there are two options (see the config sketch below):

  • Option 1: integrate the .etlt model directly into the DeepStream application. This model file is produced by the export step.
    • Using the .etlt file and the calibration cache directly, DeepStream generates the TensorRT engine file automatically and then runs inference.
  • Option 2: use tao-converter to generate a TensorRT engine file. The generated engine file can also be ingested by DeepStream.
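
In a Gst-nvinfer configuration, the two options correspond to different keys. A minimal sketch (key names from the shipped TLT sample configs; paths are illustrative; the model key must match the one used at export time, e.g. the NGC PeopleNet release uses tlt_encode):

[property]
# Option 1: encrypted TLT model; DeepStream builds the engine on first run
tlt-encoded-model=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt
tlt-model-key=tlt_encode

# Option 2: engine file previously generated with tao-converter
model-engine-file=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_int8.engine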

1)Purpose-built pre-trained models

Model Name Model arch Model output format Prunable INT8 Compatible with DS5.1 TRT-OSS required
PeopleNet DetectNet_v2 Encrypted UFF Yes Yes Yes No
TrafficCamNet DetectNet_v2 Encrypted UFF Yes Yes Yes No
DashCamNet DetectNet_v2 Encrypted UFF Yes Yes Yes No
FaceDetect-IR DetectNet_v2 Encrypted UFF Yes Yes Yes No
FaceDetect DetectNet_v2 Encrypted UFF Yes Yes Yes No
VehicleMakeNet Image Classification Encrypted UFF Yes Yes Yes No
VehicleTypeNet Image Classification Encrypted UFF Yes Yes Yes No
LPDNet DetectNet_v2 Encrypted UFF Yes Yes Yes No
LPRNet Character Recognition Encrypted ONNX No Yes Yes No
PeopleSegNet MaskRCNN Encrypted UFF No Yes Yes Yes
PeopleSemSegNet UNET Encrypted ONNX No Yes Yes Yes

Performance

Columns: Model Arch | Inference resolution | Precision | then FPS on Jetson Nano GPU, Jetson TX2 GPU, Jetson Xavier NX GPU/DLA1/DLA2, Jetson AGX Xavier GPU/DLA1/DLA2, T4 GPU, A100 PCIe GPU
PeopleNet- ResNet34 960x544 INT8 10.7 28 168 54 54 292 70 70 890 3392
PeopleNet – ResNet18 960x544 INT8 13.9 35 218 72 72 395 97 97 1086 3841
TrafficCamNet – ResNet18 960x544 INT8 19.5 52 264 105 105 478 140 140 1358 4013
DashCamNet – ResNet18 960x544 INT8 17.8 46 254 100 100 453 133 133 1320 3993
FaceDetectIR- ResNet18 384x240 INT8 101 275 1192 553 553 2010 754 754 2568 5549

Example 1: Running the PeopleNet model

# Download the encrypted TLT model file
cd /opt/nvidia/deepstream/deepstream-5.1/samples/models/
mkdir -p tlt_pretrained_models/peoplenet
cd tlt_pretrained_models/peoplenet
wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplenet/versions/pruned_v2.1/files/resnet34_peoplenet_pruned.etlt

# Run the model
cd /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models/
sudo deepstream-app -c deepstream_app_source1_peoplenet.txt

Note: on the first run, because the model-engine-file configured for primary-gie does not exist yet, deepstream-app automatically converts the .etlt file into a TensorRT engine file (.engine).

Example 2: License plate detection and recognition

License Plate Detection (LPDNet) and Recognition (LPRNet)

DeepStream 5.0 series: license plate recognition (reference article)

cd /opt/nvidia/deepstream/deepstream-5.1/sources/
# Download the test application
git clone https://github.com/NVIDIA-AI-IOT/deepstream_lpr_app.git
cd deepstream_lpr_app/

# Download the Chinese license plate models and convert the LPR model into a TensorRT engine file
./download_ch.sh
# gst-nvinfer cannot generate TRT engine for LPR model, so generate it with tao-converter
tao-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 \
models/LP/LPR/ch_lprnet_baseline18_deployable.etlt -t fp16 -e models/LP/LPR/lpr_ch_onnx_b16.engine

# Build the application and the plugin
make
cd deepstream-lpr-app
cp dict_ch.txt dict.txt
# Run the test application
sudo ./deepstream-lpr-app 2 1 0 ch_car_test.mp4 ch_car_test.mp4 output.264

2)General purpose vision models

https://docs.nvidia.com/tao/tao-toolkit/text/overview.html

The Open Model Architectures (TAO models) are trained from these general purpose models.

Columns: Backbone | Image Classification | Object Detection (DetectNet_V2, FasterRCNN, SSD, YOLOv3, RetinaNet, DSSD, YOLOv4) | Instance Segmentation (MaskRCNN) | Semantic Segmentation (UNet)
ResNet10/18/34/50/101 Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
VGG 16/19 Yes Yes Yes Yes Yes Yes Yes Yes Yes
GoogLeNet Yes Yes Yes Yes Yes Yes Yes Yes
MobileNet V1/V2 Yes Yes Yes Yes Yes Yes Yes Yes
SqueezeNet Yes Yes Yes Yes Yes Yes Yes
DarkNet 19/53 Yes Yes Yes Yes Yes Yes Yes Yes
CSPDarkNet 19/53 Yes Yes
Efficientnet B0 Yes Yes Yes Yes Yes Yes Yes
Efficientnet B1* Yes Yes

Performance

Columns: Model Arch | Inference resolution | Precision | then FPS on Jetson Nano GPU, Jetson Xavier NX GPU/DLA1/DLA2, Jetson AGX Xavier GPU/DLA1/DLA2, T4 GPU
YoloV3 – ResNet18 960x544 INT8 11 78 55 55 223 84 84 620
FasterRCNN – ResNet10 480x272 INT8 16 127 N/A N/A 281 N/A N/A 932
SSD – ResNet18 960x544 INT8 10.6 124 56 56 216 77 77 760
DSSD – ResNet18 960x544 INT8 9 66 45 45 189 67 67 586
RetinaNet – ResNet18 960x544 INT8 8.5 60 45 45 147 41 41 296
MaskRCNN – ResNet50 1344x832 INT8 0.6 5.4 3.2 3.2 9.2 4.5 4.5 24

The following TAO models can be integrated into DeepStream 5.1:

  • Image Classification
  • Object Detection
    • Yolo V3
    • Yolo V4
    • DSSD
    • SSD
    • RetinaNet
    • DetectNet_v2
    • FasterRCNN
  • Instance Segmentation
    • Mask-RCNN
  • Semantic Segmentation
    • UNet
  • Character Recognition
  • MultiTask Classification

Test the apps and models

sudo su
cd /opt/nvidia/deepstream/deepstream-5.1/sources
git clone -b release/tao3.0 https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps

# Build the source
export CUDA_VER=11.1
make

# Download the models
./download_models.sh
tree apps/ -L 1
apps/
├── Makefile
├── tao_classifier
├── tao_detection
├── tao_others
└── tao_segmentation
tree models/ -L 1
    models/
    ├── dssd
    ├── emotion
    ├── faciallandmark
    ├── frcnn
    ├── gazenet
    ├── heartrate
    ├── peopleSegNet
    ├── peopleSemSegNet
    ├── retinanet
    ├── ssd
    ├── unet
    ├── yolov3
    └── yolov4
# The UNet/peopleSemSegNet/yolov3/yolov4 models cannot be converted by DeepStream directly; use tao-converter to convert the .etlt file into an .engine file
tao-converter -e models/unet/unet_resnet18.etlt_b1_gpu0_fp16.engine -p input_1,1x3x608x960,1x3x608x960,1x3x608x960 -t fp16 -k tlt_encode -m 1 models/unet/unet_resnet18.etlt

# Test the apps and the models
./apps/tao_segmentation/ds-tao-segmentation -h
        Usage: ./apps/tao_segmentation/ds-tao-segmentation -c pgie_config_file -i <H264 or JPEG input file> [-b BATCH] [-d]
-h: print help info 
-c: pgie config file, e.g. pgie_frcnn_tao_config.txt  
-i: H264 or JPEG input file  
-b: batch size, this will override the value of "batch-size" in pgie config file  
-d: enable display, otherwise dump to output H264 or JPEG file
        
 ./apps/tao_segmentation/ds-tao-segmentation -c configs/unet_tao/pgie_unet_tao_config.txt -i ../../samples/streams/sample_720p.h264
 
  ./apps/tao_classifier/ds-tao-classifier -c configs/multi_task_tao/pgie_multi_task_tao_config.txt -i ../../samples/streams/sample_720p.h264
 
 export SHOW_MASK=1
 ./apps/tao_detection/ds-tao-detection  -c configs/frcnn_tao/pgie_frcnn_tao_config.txt -i ../../samples/streams/sample_720p.h264

Note: the model conversion commands and invocation references can be found in the corresponding model cards on NGC.

四、Docker-based Deployment and Usage

List of dGPU DeepStream containers, from https://ngc.nvidia.com

Container Container pull commands
base docker (contains only the runtime libraries and GStreamer plugins. Can be used as a base to build custom dockers for DeepStream applications) docker pull nvcr.io/nvidia/deepstream:5.1-21.02-base
devel docker (contains the entire SDK along with a development environment for building DeepStream applications) docker pull nvcr.io/nvidia/deepstream:5.1-21.02-devel
Triton Inference Server docker with Triton Inference Server and dependencies installed along with a development environment for building DeepStream applications docker pull nvcr.io/nvidia/deepstream:5.1-21.02-triton
DeepStream IoT docker with deepstream-test5-app installed and all other reference applications removed docker pull nvcr.io/nvidia/deepstream:5.1-21.02-iot
DeepStream samples docker (contains the runtime libraries, GStreamer plugins, reference applications and sample streams, models and configs) docker pull nvcr.io/nvidia/deepstream:5.1-21.02-samples

Usage example: running the PeopleNet model in a container

mkdir -p $HOME/peoplenet && \
wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplenet/versions/pruned_v2.1/files/resnet34_peoplenet_pruned.etlt \
-O $HOME/peoplenet/resnet34_peoplenet_pruned.etlt

xhost +
docker run --gpus all -it --rm -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -v $HOME:/opt/nvidia/deepstream/deepstream-5.1/samples/models/tlt_pretrained_models \
-w /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models nvcr.io/nvidia/deepstream:5.1-21.02-samples \
deepstream-app -c deepstream_app_source1_peoplenet.txt

    # xhost grants access to the host's X display
    --gpus all    makes the GPUs visible to the container
    --device=/dev/video0    maps camera 1 into the container
    -it -p 8554:8554    maps the RTSP streaming port (optional)
    -p 5400:5400/udp    maps the RTSP streaming UDP port (optional)
    -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=:0    connects the GUI to the host
    -v $HOME:/opt/nvidia/deepstream/deepstream-5.1/samples/models/tlt_pretrained_models    mounts a host folder into the container
    -w /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models    sets the working directory inside the container

五、Notes on Key Plugins

1.Gst-nvinfer


The plugin accepts batched NV12/RGBA buffers from upstream.


1)Inputs and Outputs

  • Inputs

    • Gst Buffer
    • NvDsBatchMeta (attaching NvDsFrameMeta)
    • Caffe Model and Caffe Prototxt
    • ONNX
    • UFF file
    • TLT Encoded Model and Key
    • Offline: Supports engine files generated by Transfer Learning Toolkit SDK Model converters
    • Layers: Supports all layers supported by TensorRT, see: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html.
  • Control parameters

    Gst-nvinfer gets control parameters from a configuration file. You can specify this by setting the property config-file-path. For details, see Gst-nvinfer File Configuration Specifications. Other control parameters that can be set through GObject properties (see the sketch after this list) are:

    • Batch size
    • Inference interval
    • Attach inference tensor outputs as buffer metadata
    • Attach instance mask output as in object metadata
    • The parameters set through the GObject properties override the parameters in the Gst-nvinfer configuration file.
  • Outputs

    • Gst Buffer
    • Depending on network type and configured parameters, one or more of:
    • NvDsObjectMeta
    • NvDsClassifierMeta
    • NvDsInferSegmentationMeta
    • NvDsInferTensorMeta
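
As an illustration (a sketch; the property names are those exposed by gst-inspect-1.0 nvinfer), setting control parameters through GObject properties from Python looks like this:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
pgie.set_property("config-file-path", "config_infer_primary.txt")
# Properties set here override the corresponding keys in the config file:
pgie.set_property("batch-size", 4)              # inference batch size
pgie.set_property("interval", 1)                # skip 1 batch between inferences
pgie.set_property("output-tensor-meta", True)   # attach raw tensor output as metadata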

2)Gst-nvinfer File Configuration Specifications

The Gst-nvinfer configuration file uses a “Key File” format.

https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html#inputs-and-outputs

The configuration parameters that you must specify include:

  • model-file (Caffe model)
  • proto-file (Caffe model)
  • uff-file (UFF models)
  • onnx-file (ONNX models)
  • model-engine-file, if already generated
  • int8-calib-file for INT8 mode
  • mean-file, if required
  • offsets, if required
  • maintain-aspect-ratio, if required
  • parse-bbox-func-name (detectors only) // name of the custom bounding box parsing function
  • parse-classifier-func-name (classifiers only) // name of the custom classifier output parsing function
  • custom-lib-path // shared library containing the custom parser implementation
  • output-blob-names (Caffe and UFF models)
  • network-type
  • model-color-format
  • process-mode
  • engine-create-func-name
  • infer-dims (UFF models)
  • uff-input-order (UFF models)
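
A minimal sketch of a Gst-nvinfer configuration for the sample Caffe detector (keys follow the shipped config_infer_primary.txt; paths and values are illustrative):

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-file=../../models/Primary_Detector/resnet10.caffemodel
proto-file=../../models/Primary_Detector/resnet10.prototxt
model-engine-file=../../models/Primary_Detector/resnet10.caffemodel_b4_gpu0_int8.engine
int8-calib-file=../../models/Primary_Detector/cal_trt.bin
labelfile-path=../../models/Primary_Detector/labels.txt
batch-size=4
network-mode=1            # 0=FP32, 1=INT8, 2=FP16
network-type=0            # 0=Detector, 1=Classifier, 2=Segmentation
process-mode=1            # 1=primary (full frame), 2=secondary (on objects)
num-detected-classes=4
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid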

2.Gst-nvmsgconv and Gst-nvmsgbroker

1)Gst-nvmsgconv

The Gst-nvmsgconv plugin parses NVDS_EVENT_MSG_META (NvDsEventMsgMeta) type metadata, attached to the buffer as user metadata of the frame meta, and generates the schema payload (full or minimal).

  • Full DeepStream schema: generates the payload in JSON format and supports elaborate semantics for object detection, analytics modules, events, location, and sensor.

Inputs and Outputs

  • Inputs
    • Gst Buffer with NvDsEventMsgMeta
  • Control parameters
    • config
    • msg2p-lib
    • payload-type
    • comp-id
  • Output
    • Same Gst Buffer with additional NvDsPayload metadata. This metadata contains information about the payload generated by the plugin.

2)Gst-nvmsgbroker

This plugin sends payload messages to the server using a specified communication protocol.

It accepts any buffer that has NvDsPayload metadata attached and uses the nvds_msgapi_* interface to send the messages to the server. You must implement the nvds_msgapi_* interface for the protocol to be used and specify the implementing library in the proto-lib property.

image

Inputs and Outputs

  • Inputs
    • Gst Buffer with NvDsPayload
  • Control parameters
    • Config
    • conn-str
    • proto-lib
    • comp-id
    • topic
  • Output
    • None, as this is a sink component

Features

  • Payload in JSON format
  • Kafka protocol support
  • Azure IOT support
  • AMQP support
  • REDIS support
  • Custom protocol support

3)Example: Sending Output over AMQP

  • Install dependencies

    git clone -b v0.8.0  --recursive https://github.com/alanxz/rabbitmq-c.git
    cd rabbitmq-c/
    mkdir build && cd build
    cmake ..
    cmake --build .
    
    sudo cp ./librabbitmq/librabbitmq.so.4 /usr/lib/
    
    sudo apt-get install libglib2.0 libglib2.0-dev
    
  • Run the AMQP broker (RabbitMQ)

    Run it as a container; once the container is up, the management console is available in a browser at http://localhost:15672.

    • RabbitMQ default username: guest, password: guest
    $ sudo docker pull rabbitmq:management
    $ sudo docker run --name rabbitmq -d -p 15672:15672 -p 5672:5672 rabbitmq:management
      # -d runs the container in the background;
      # --name sets the container name;
      # -p maps the service ports (5672: application port; 15672: web console port);
    
  • Run the sample application deepstream-test4

    $ cd /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-test4
    $ deepstream-test4-app -h
    Usage:
      deepstream-test4-app [OPTION?] Nvidia DeepStream Test4
    
    Help Options:
      -h, --help                        Show help options
      --help-all                        Show all help options
      --help-gst                        Show GStreamer Options
    
    Application Options:
      -c, --cfg-file                    Set the adaptor config file. Optional if connection string has relevant  details.
      -i, --input-file                  Set the input H264 file
      -t, --topic                       Name of message topic. Optional if it is part of connection string or config file.
      --conn-str                        Connection string of backend server. Optional if it is part of config file.
      -p, --proto-lib                   Absolute path of adaptor library
      -s, --schema                      Type of message schema (0=Full, 1=minimal), default=0
      --no-display                      Disable display
    
    $ deepstream-test4-app -i /opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264 -p /opt/nvidia/deepstream/deepstream/lib/libnvds_amqp_proto.so -c cfg_amqp.txt -t dstopic -s 1 --no-display
      # sets the sample topic name to dstopic
      # -s selects the message schema
    

    The configuration file cfg_amqp.txt

    • The sample application acts as the producer publishing to the topic
    [message-broker]
    password = guest
    hostname = localhost
    username = guest
    port = 5672
    exchange = amq.topic
    topic = topicname
    #share-connection = 1
    
  • In RabbitMQ, create a queue “dsqueue” and bind it to the exchange “amq.topic”

  • Write a consumer application

    rabbitmq-consumer

    // 1.import the library
    package main
    import (
            "log"
            "github.com/streadway/amqp"
    )
    
    func failOnError(err error, msg string) {
            if err != nil {
                    log.Fatalf("%s: %s", msg, err)
            }
    }
    
    func main() {
        // 2.connect to RabbitMQ server
            conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
            failOnError(err, "Failed to connect to RabbitMQ")
            defer conn.Close()
    
        // 3.create a channel
            ch, err := conn.Channel()
            failOnError(err, "Failed to open a channel")
            defer ch.Close()
    
        //4. consume messages from the queue
            msgs, err := ch.Consume(
                    "dsqueue", // queue
                    "",     // consumer
                    true,   // auto-ack
                    false,  // exclusive
                    false,  // no-local
                    false,  // no-wait
                    nil,    // args
            )
            failOnError(err, "Failed to register a consumer")
    
            forever := make(chan bool)
    
            go func() {
            //5. print the content of each queue message
                    for d := range msgs {
                            log.Printf("Received a message: %s", d.Body)
                    }
            }()
    
            log.Printf(" [*] Waiting for messages. To exit press CTRL+C")
            <-forever
    }
    

    Build and run the program (assuming the github.com/streadway/amqp dependency has been fetched, e.g. with go get):

    $ ./rabbitmq-consumer 
    # full schema 
    2021/10/22 16:47:34 Received a message: {
      "messageid" : "a6b0fe29-bc4c-4c1f-b10c-9425eaeaf452",
      "mdsversion" : "1.0",
      "@timestamp" : "2021-10-22T08:45:59.281Z",
      "place" : {
        "id" : "1",
        "name" : "XYZ",
        "type" : "garage",
        "location" : {
          "lat" : 30.32,
          "lon" : -40.549999999999997,
          "alt" : 100.0
        },
        "aisle" : {
          "id" : "walsh",
          "name" : "lane1",
          "level" : "P2",
          "coordinate" : {
            "x" : 1.0,
            "y" : 2.0,
            "z" : 3.0
          }
        }
      },
      "sensor" : {
        "id" : "CAMERA_ID",
        "type" : "Camera",
        "description" : "\"Entrance of Garage Right Lane\"",
        "location" : {
          "lat" : 45.293701446999997,
          "lon" : -75.830391449900006,
          "alt" : 48.155747933800001
        },
        "coordinate" : {
          "x" : 5.2000000000000002,
          "y" : 10.1,
          "z" : 11.199999999999999
        }
      },
      "analyticsModule" : {
        "id" : "XYZ",
        "description" : "\"Vehicle Detection and License Plate Recognition\"",
        "source" : "OpenALR",
        "version" : "1.0"
      },
      "object" : {
        "id" : "-1",
        "speed" : 0.0,
        "direction" : 0.0,
        "orientation" : 0.0,
        "vehicle" : {
          "type" : "sedan",
          "make" : "Bugatti",
          "model" : "M",
          "color" : "blue",
          "licenseState" : "CA",
          "license" : "XX1234",
          "confidence" : -0.10000000149011612
        },
        "bbox" : {
          "topleftx" : 1173,
          "toplefty" : 481,
          "bottomrightx" : 1227,
          "bottomrighty" : 504
        },
        "location" : {
          "lat" : 0.0,
          "lon" : 0.0,
          "alt" : 0.0
        },
        "coordinate" : {
          "x" : 0.0,
          "y" : 0.0,
          "z" : 0.0
        }
      },
      "event" : {
        "id" : "72931242-c039-4f90-b786-bdb296863c23",
        "type" : "moving"
      },
      "videoPath" : ""
    }
    
    # minimal schema
    2021/10/22 16:56:13 Received a message: {
      "version" : "4.0",
      "id" : 180,
      "@timestamp" : "2021-10-22T08:56:13.744Z",
      "sensorId" : "sensor-0",
      "objects" : [
        "-1|1176|475.435|1260|516.522|Vehicle|#|sedan|Bugatti|M|blue|XX1234|CA|-0.1"
      ]
    }
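
    The full schema above is ordinary nested JSON and can be decoded directly with encoding/json; the minimal schema instead packs each detected object into a pipe-delimited string. Below is a hedged decoding sketch: the field layout (tracking id, bbox left/top/right/bottom, object label, then "#"-separated secondary attributes) is inferred from the sample output above, not from an official schema definition.

    // parse_minimal.go: decode a minimal-schema message body
    package main

    import (
        "encoding/json"
        "fmt"
        "strings"
    )

    // minimalMsg mirrors the minimal-schema payload printed above.
    type minimalMsg struct {
        Version   string   `json:"version"`
        ID        int      `json:"id"`
        Timestamp string   `json:"@timestamp"`
        SensorID  string   `json:"sensorId"`
        Objects   []string `json:"objects"`
    }

    func main() {
        // In the consumer above this would be d.Body.
        payload := []byte(`{"version":"4.0","id":180,"@timestamp":"2021-10-22T08:56:13.744Z","sensorId":"sensor-0","objects":["-1|1176|475.435|1260|516.522|Vehicle|#|sedan|Bugatti|M|blue|XX1234|CA|-0.1"]}`)

        var msg minimalMsg
        if err := json.Unmarshal(payload, &msg); err != nil {
            panic(err)
        }
        for _, obj := range msg.Objects {
            f := strings.Split(obj, "|")
            if len(f) < 7 {
                continue // unexpected layout, skip
            }
            // f[0]=tracking id, f[1..4]=bbox left/top/right/bottom,
            // f[5]=label, f[6]="#", f[7:]=secondary attributes.
            fmt.Printf("sensor=%s id=%s bbox=(%s,%s)-(%s,%s) label=%s attrs=%v\n",
                msg.SensorID, f[0], f[1], f[2], f[3], f[4], f[5], f[7:])
        }
    }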
    

3.Using a Custom AI Model

https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_custom_YOLO.html?highlight=yolov3

A custom AI model must implement its own bounding box parser function; otherwise running the model fails with errors such as:

0:00:04.191794733  7711 0x5611f74a9b20 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox()  [UID = 1]: Could not find output coverage layer for parsing objects
0:00:04.191854933  7711 0x5611f74a9b20 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput()  [UID = 1]: Failed to parse bboxes
Segmentation fault (core dumped)

DeepStream currently ships with the following built-in output parsers; any other model requires a custom implementation:

  • FasterRCNN
  • MaskRCNN
  • SSD
  • YoloV3 / YoloV3Tiny / YoloV2 / YoloV2Tiny
  • DetectNet

1)The nvdsinfer_customparser sample

DeepStream supports NVIDIA® TensorRT™ plugins for custom layers.

Location: /opt/nvidia/deepstream/deepstream-5.1/sources/libs/nvdsinfer_customparser

$ ls nvdsinfer_customparser
Makefile  nvdsinfer_custombboxparser.cpp  nvdsinfer_customclassifierparser.cpp  README

nvdsinfer_custombboxparser.cpp

#include <cstring>
#include <iostream>
#include "nvdsinfer_custom_impl.h"
#include <cassert>

#define MIN(a,b) ((a) < (b) ? (a) : (b))
#define MAX(a,b) ((a) > (b) ? (a) : (b))
#define CLIP(a,min,max) (MAX(MIN(a, max), min))
#define DIVIDE_AND_ROUND_UP(a, b) ((a + b - 1) / b)

struct MrcnnRawDetection {
    float y1, x1, y2, x2, class_id, score;
};

/* This is a sample bounding box parsing function for the sample Resnet10
 * detector model provided with the SDK. */

/* C-linkage to prevent name-mangling */
extern "C"
bool NvDsInferParseCustomResnet (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
        NvDsInferNetworkInfo  const &networkInfo,
        NvDsInferParseDetectionParams const &detectionParams,
        std::vector<NvDsInferObjectDetectionInfo> &objectList);


extern "C"
bool NvDsInferParseCustomResnet (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
        NvDsInferNetworkInfo  const &networkInfo,
        NvDsInferParseDetectionParams const &detectionParams,
        std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
  static NvDsInferDimsCHW covLayerDims;
  static NvDsInferDimsCHW bboxLayerDims;
  static int bboxLayerIndex = -1;
  static int covLayerIndex = -1;
  static bool classMismatchWarn = false;
  int numClassesToParse;

  /* Find the bbox layer */
  if (bboxLayerIndex == -1) {
    for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
      if (strcmp(outputLayersInfo[i].layerName, "conv2d_bbox") == 0) {
        bboxLayerIndex = i;
        getDimsCHWFromDims(bboxLayerDims, outputLayersInfo[i].inferDims);
        break;
      }
    }
    if (bboxLayerIndex == -1) {
      std::cerr << "Could not find bbox layer buffer while parsing" << std::endl;
      return false;
    }
  }

  /* Find the cov layer */
  if (covLayerIndex == -1) {
    for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
      if (strcmp(outputLayersInfo[i].layerName, "conv2d_cov/Sigmoid") == 0) {
        covLayerIndex = i;
        getDimsCHWFromDims(covLayerDims, outputLayersInfo[i].inferDims);
        break;
      }
    }
    if (covLayerIndex == -1) {
      std::cerr << "Could not find cov layer buffer while parsing" << std::endl;
      return false;
    }
  }

  /* Warn in case of mismatch in number of classes */
  if (!classMismatchWarn) {
    if (covLayerDims.c != detectionParams.numClassesConfigured) {
      std::cerr << "WARNING: Num classes mismatch. Configured:" <<
        detectionParams.numClassesConfigured << ", detected by network: " <<
        covLayerDims.c << std::endl;
    }
    classMismatchWarn = true;
  }

  /* Calculate the number of classes to parse */
  numClassesToParse = MIN (covLayerDims.c, detectionParams.numClassesConfigured);

  int gridW = covLayerDims.w;
  int gridH = covLayerDims.h;
  int gridSize = gridW * gridH;
  float gcCentersX[gridW];
  float gcCentersY[gridH];
  float bboxNormX = 35.0;
  float bboxNormY = 35.0;
  float *outputCovBuf = (float *) outputLayersInfo[covLayerIndex].buffer;
  float *outputBboxBuf = (float *) outputLayersInfo[bboxLayerIndex].buffer;
  int strideX = DIVIDE_AND_ROUND_UP(networkInfo.width, bboxLayerDims.w);
  int strideY = DIVIDE_AND_ROUND_UP(networkInfo.height, bboxLayerDims.h);

  for (int i = 0; i < gridW; i++)
  {
    gcCentersX[i] = (float)(i * strideX + 0.5);
    gcCentersX[i] /= (float)bboxNormX;

  }
  for (int i = 0; i < gridH; i++)
  {
    gcCentersY[i] = (float)(i * strideY + 0.5);
    gcCentersY[i] /= (float)bboxNormY;

  }

  for (int c = 0; c < numClassesToParse; c++)
  {
    float *outputX1 = outputBboxBuf + (c * 4 * bboxLayerDims.h * bboxLayerDims.w);

    float *outputY1 = outputX1 + gridSize;
    float *outputX2 = outputY1 + gridSize;
    float *outputY2 = outputX2 + gridSize;

    float threshold = detectionParams.perClassPreclusterThreshold[c];
    for (int h = 0; h < gridH; h++)
    {
      for (int w = 0; w < gridW; w++)
      {
        int i = w + h * gridW;
        if (outputCovBuf[c * gridSize + i] >= threshold)
        {
          NvDsInferObjectDetectionInfo object;
          float rectX1f, rectY1f, rectX2f, rectY2f;

          rectX1f = (outputX1[w + h * gridW] - gcCentersX[w]) * -bboxNormX;
          rectY1f = (outputY1[w + h * gridW] - gcCentersY[h]) * -bboxNormY;
          rectX2f = (outputX2[w + h * gridW] + gcCentersX[w]) * bboxNormX;
          rectY2f = (outputY2[w + h * gridW] + gcCentersY[h]) * bboxNormY;

          object.classId = c;
          object.detectionConfidence = outputCovBuf[c * gridSize + i];

          /* Clip object box co-ordinates to network resolution */
          object.left = CLIP(rectX1f, 0, networkInfo.width - 1);
          object.top = CLIP(rectY1f, 0, networkInfo.height - 1);
          object.width = CLIP(rectX2f, 0, networkInfo.width - 1) -
                             object.left + 1;
          object.height = CLIP(rectY2f, 0, networkInfo.height - 1) -
                             object.top + 1;

          objectList.push_back(object);
        }
      }
    }
  }
  return true;
}


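/* Compile-time check (provided by nvdsinfer_custom_impl.h) that the function
 * above matches the custom bbox-parsing prototype expected by nvinfer; a
 * mismatch fails the build instead of crashing at runtime. */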
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomResnet);

2)Modify the model's nvinfer configuration file

Build the custom parser first (the sample's Makefile produces libnvds_infercustomparser.so), then point the configuration at the parsing function and library:

# For resnet10 detector
parse-bbox-func-name=NvDsInferParseCustomResnet
custom-lib-path=/path/to/this/directory/libnvds_infercustomparser.so

# For resnet18 vehicle type classifier
parse-classifier-func-name=NvDsInferClassiferParseCustomSoftmax
custom-lib-path=/path/to/this/directory/libnvds_infercustomparser.so

# For Tensorflow/Onnx SSD detector within nvinferserver
infer_config {
  postprocess { detection {
      custom_parse_bbox_func: "NvDsInferParseCustomTfSSD"
      ...
  } }
  ...
  custom_lib {
    path: "/path/to/this/directory/libnvds_infercustomparser.so"
  }
}

Other resources

NVIDIA DeepStream SDK Developer Guide

Deepstream SDK FAQ
