Contents
I. Environment Setup
1.1 TensorRT Installation
1.2 OpenCV Installation
II. C++ FastReID-TensorRT
2.1 Model Conversion
2.2 Modify the Config File
2.3 Build Third-Party Libraries
2.4 Build the fastrt Executable
2.5 Run
I. Environment Setup
1.1 TensorRT Installation
Official manual: Installation Guide :: NVIDIA Deep Learning TensorRT Documentation
Choose the tar installation.
version="8.x.x.x" arch=$(uname -m) cuda="cuda-x.x" cudnn="cudnn8.x" tar xzvf TensorRT-${version}.Linux.${arch}-gnu.${cuda}.${cudnn}.tar.gzWhere:
ls TensorRT-${version} bin data doc graphsurgeon include lib onnx_graphsurgeon python samples targets TensorRT-Release-Notes.pdf uff
Add the absolute path of the TensorRT lib directory to LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<TensorRT-${version}/lib>
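This setting only lasts for the current shell. A common way to make it persistent (the install path below is an assumed example; replace it with your actual extraction path) is to append the export to ~/.bashrc:

# Append the export to ~/.bashrc so new shells pick it up
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/TensorRT-8.x.x.x/lib' >> ~/.bashrc
source ~/.bashrc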
cd TensorRT-${version}/python
python3 -m pip install tensorrt-*-cp3x-none-linux_x86_64.whl
cd TensorRT-${version}/uff
python3 -m pip install uff-0.6.9-py2.py3-none-any.whl

Check the installation with:
which convert-to-uff
cd TensorRT-${version}/graphsurgeon
python3 -m pip install graphsurgeon-0.4.5-py2.py3-none-any.whl
cd TensorRT-${version}/onnx_graphsurgeon
python3 -m pip install onnx_graphsurgeon-0.3.12-py2.py3-none-any.whl
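As a quick sanity check (my own addition, not part of the official steps), verify that the Python bindings import cleanly:

# Should print the installed TensorRT version without errors
python3 -c "import tensorrt; print(tensorrt.__version__)"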
II. C++ FastReID-TensorRT
2.1 Model Conversion
To go from .pth to .wts, FastReID provides a conversion script:
python projects/FastRT/tools/gen_wts.py --config-file='config/you/use/in/fastreid/xxx.yml' \
--verify --show_model --wts_path='outputs/trt_model_file/xxx.wts' \
MODEL.WEIGHTS '/path/to/checkpoint_file/model_best.pth' MODEL.DEVICE "cuda:0"
Then move the .wts file into the FastRT directory:
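For example (both paths here are illustrative and depend on where you saved the .wts file and where fast-reid is cloned):

# Move the generated weights into the FastRT project directory
mv outputs/trt_model_file/xxx.wts /path/to/fast-reid/projects/FastRT/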
2.2 Modify the Config File
Modify the config according to the backbone configuration of your model (sbs_R50-ibn, kd-r34-r101_ibn, etc.), following the documentation the project provides:
https://github.com/JDAI-CV/fast-reid/tree/master/projects/FastRT#ConfigSection
Below is the example for kd-r34-r101_ibn. It configures the model path, batch size, input size, output feature dimension, device ID, and the model's backbone, head, and so on:
static const std::string WEIGHTS_PATH = "../kd_r34_distill.wts";  // path to the converted .wts weights
static const std::string ENGINE_PATH = "./kd_r34_distill.engine"; // where the serialized engine is written
static const int MAX_BATCH_SIZE = 4;   // maximum inference batch size
static const int INPUT_H = 384;        // input image height
static const int INPUT_W = 128;        // input image width
static const int OUTPUT_SIZE = 512;    // output feature dimension
static const int DEVICE_ID = 0;        // GPU index
static const FastreidBackboneType BACKBONE = FastreidBackboneType::r34_distill;
static const FastreidHeadType HEAD = FastreidHeadType::EmbeddingHead;
static const FastreidPoolingType HEAD_POOLING = FastreidPoolingType::gempoolP;
static const int LAST_STRIDE = 1;      // stride of the last backbone stage
static const bool WITH_IBNA = false;   // backbone without IBN-a blocks
static const bool WITH_NL = false;     // backbone without Non-local blocks
static const int EMBEDDING_DIM = 0;    // 0 = no extra embedding reduction in the head
2.3 Build Third-Party Libraries
Mainly the cnpy library, which is used to read and write numpy files:
cd third_party/cnpy
cmake -DCMAKE_INSTALL_PREFIX=../../libs/cnpy -DENABLE_STATIC=OFF . && make -j4 && make install
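If the install succeeds, the headers and shared library should land under libs/cnpy (a quick check of my own, assuming the install prefix above):

# Should list the include/ and lib/ directories created by 'make install'
ls ../../libs/cnpy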
2.4 Build the fastrt Executable
mkdir build
cd build
cmake -DBUILD_FASTRT_ENGINE=ON \
-DBUILD_DEMO=ON \
-DUSE_CNUMPY=ON ..
make
If make fails at this step with:

fatal error: NvInfer.h: No such file or directory
 #include "NvInfer.h"
          ^~~~~~~~~~~
compilation terminated.
we can see that in demo/CMakeLists.txt the TensorRT library and header paths are configured as:
include_directories(/usr/include/x86_64-linux-gnu/)
link_directories(/usr/lib/x86_64-linux-gnu/)
This shows that the TensorRT libraries and headers have not been added to the system paths, so we need to:
# From inside the TensorRT installation directory
sudo cp -r ./lib/* /usr/lib
sudo cp -r ./include/* /usr/include
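After copying the shared libraries, it is also worth refreshing the dynamic linker cache (my addition, not in the original steps):

# Rebuild the runtime linker cache so the copied .so files are found
sudo ldconfig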
Alternatively, add the absolute paths of the TensorRT libraries and headers directly in CMakeLists.txt:
include_directories(/.../TensorRT-7.2.3.4/include/)
link_directories(/.../TensorRT-7.2.3.4/lib/)
This resolves the problem.
2.5 Run
./demo/fastrt -s  # serialize the model and generate the .engine file
./demo/fastrt -d  # deserialize the engine file and run inference
If an error occurs while deserializing the engine file:
[E] [TRT] INVALID_CONFIG: The engine plan file is generated on an incompatible device, expecting compute 7.5 got compute 8.6, please rebuild.
This means the GPU that generated the engine file is a different model from the GPU deserializing it, so their compute capabilities don't match. Rebuild the engine (./demo/fastrt -s) on the same GPU model that will run it, and the error goes away.
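To confirm which compute capability the current machine has (my own suggestion; this uses PyTorch, which FastReID already depends on):

# Print the compute capability of GPU 0, e.g. (8, 6)
python3 -c "import torch; print(torch.cuda.get_device_capability(0))"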