Installing TensorRT requires that CUDA is already installed, and the TensorRT version has to be chosen to match your CUDA version.
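To know which TensorRT package to pick, you first need the installed CUDA release. Below is a minimal sketch that reads it from the output of `nvcc --version`. The helper names `parse_cuda_release` and `installed_cuda_release` are made up for this illustration, and the sketch assumes nvcc is on your PATH (it returns None otherwise).

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Return the CUDA release (e.g. '11.7') found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc --version` if nvcc is on PATH; return None when it is not."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode bytes to str (works on Python 3.6 too)
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    print("CUDA release:", installed_cuda_release())
```

Compare the printed release with the CUDA versions listed in the name of the TensorRT package before downloading it.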
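Because I needed to convert a model whose input has a time dimension, (1,8,3,320,320), converting with TensorRT 7.1.3 raised the following error, most likely because that version does not support it: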
Assertion failed: convertOnnxPadding(onnxPadding, &begPadding, &endPadding) && "This version of TensorRT only supports padding on the outer two dimensions!"
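So I switched TensorRT versions and moved to TensorRT 8.4. (Any other 8.0+ version can be installed by following the same steps.)
1. First, download it from the official site: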
https://developer.nvidia.com/nvidia-tensorrt-8x-download
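Choose the first TAR package, download it, and extract it into the folder where you want to keep it:
TensorRT 8.4 GA for Linux x86_64 and CUDA 11.0, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6 and 11.7 TAR Package
2. Add environment variables (change the paths to your own). Open ~/.bashrc (sudo is not needed for your own file), append the two export lines, then reload it with source:
gedit ~/.bashrc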
export LD_LIBRARY_PATH=/home/sensoro/TensorRT-8.4.2.4/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=/home/sensoro/TensorRT-8.4.2.4/lib:$LIBRARY_PATH
source ~/.bashrc
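To confirm that the exports took effect, you can check whether the TensorRT lib folder really is on LD_LIBRARY_PATH. This is a small sketch; `dir_on_search_path` is a hypothetical helper written for this post, and the path is the example one from the exports, so replace it with yours.

```python
import os


def dir_on_search_path(directory, path_value):
    """Return True if `directory` is an entry of a colon-separated path string."""
    wanted = os.path.normpath(directory)
    # Skip empty entries such as the one produced by an accidental "::"
    entries = [p for p in path_value.split(":") if p]
    return any(os.path.normpath(p) == wanted for p in entries)


if __name__ == "__main__":
    trt_lib = "/home/sensoro/TensorRT-8.4.2.4/lib"  # example path; change to your own
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("TensorRT lib on LD_LIBRARY_PATH:", dir_on_search_path(trt_lib, current))
```

Run it in a new terminal (or after source ~/.bashrc); if it prints False, the export line did not reach the current shell.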
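3. Install the TensorRT Python package. Go into the extracted TensorRT folder, then into its python folder.
4. Open a terminal, activate your conda environment, and run the following (pick the wheel whose cpXX tag matches your Python version; cp36 is for Python 3.6):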
pip install tensorrt-8.4.2.4-cp36-none-linux_x86_64.whl
Then start python and run:
import tensorrt
tensorrt.__version__
If the version is printed, the installation is correct.
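The same check can be scripted so that a missing package or a missing shared library gives a readable message instead of a traceback. This is only a sketch; `installed_version` is a hypothetical helper, not part of TensorRT.

```python
import importlib


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Raised when the package is missing or its shared libraries cannot be loaded
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = installed_version("tensorrt")
    if version is None:
        print("tensorrt could not be imported; check the wheel and LD_LIBRARY_PATH")
    else:
        print("tensorrt version:", version)
```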
You can also install the other two packages while you are at it.
1) Install the uff package
cd ../uff # switch to the uff folder
pip install uff-0.6.9-py2.py3-none-any.whl
2) Install the graphsurgeon package
cd ../graphsurgeon # switch to the graphsurgeon folder
pip install graphsurgeon-0.3.2-py2.py3-none-any.whl
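The tensorrt wheel in step 4 is named for Python 3.6 (cp36). If your conda environment uses another Python version, the sketch below builds the filename to look for. It assumes the other wheels in the python folder follow the same naming pattern as the cp36 one, which you should confirm by listing the folder; `tensorrt_wheel_name` is a hypothetical helper.

```python
import sys


def tensorrt_wheel_name(trt_version="8.4.2.4", version_info=None):
    """Build the expected wheel filename for the given (major, minor) Python version."""
    info = sys.version_info if version_info is None else version_info
    tag = "cp{}{}".format(info[0], info[1])  # e.g. cp36 for Python 3.6
    return "tensorrt-{}-{}-none-linux_x86_64.whl".format(trt_version, tag)


if __name__ == "__main__":
    print("pip install", tensorrt_wheel_name())
```

If no wheel with that name exists in the python folder, that TensorRT release probably does not ship a wheel for your Python version, and creating a conda environment with a supported version is the simpler way out.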
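5. Copy the files to the system paths (!!)
Copy the files under lib/ in the TensorRT root directory to /usr/lib/,
and the files under include/ in the TensorRT root directory to /usr/include/ (run both commands from the TensorRT root directory):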
sudo cp -r lib/* /usr/lib/
sudo cp -r include/* /usr/include/
That completes the installation.
6. Method 1: write a script to convert the ONNX model to TRT
Only the paths and the input shape need to be changed.
import os
import sys
import tensorrt as trt

TRT_LOGGER = trt.Logger()
model_path = 'test.onnx'
engine_file_path = "test.trt"
# Explicit-batch flag: the batch dimension is part of the tensor shapes
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

if not os.path.exists(model_path):
    print('ONNX file {} not found.'.format(model_path))
    sys.exit(1)

with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network(EXPLICIT_BATCH) as network, \
        trt.OnnxParser(network, TRT_LOGGER) as parser:
    print('Loading ONNX file from path {}...'.format(model_path))
    with open(model_path, 'rb') as model:
        print('Beginning ONNX file parsing')
        if not parser.parse(model.read()):
            print('ERROR: Failed to parse the ONNX file.')
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            sys.exit(1)  # do not try to build from a network that failed to parse
    # Fixed input shape: (batch, time, channels, height, width)
    network.get_input(0).shape = [1, 8, 3, 320, 320]
    config = builder.create_builder_config()
    # The input shape is static, so the profile needs no min/opt/max shapes
    profile = builder.create_optimization_profile()
    config.add_optimization_profile(profile)
    trt_model_engine = builder.build_engine(network, config)
    if trt_model_engine is None:
        print('ERROR: Failed to build the engine.')
        sys.exit(1)
    # Creating a context checks that the engine is usable
    trt_model_context = trt_model_engine.create_execution_context()
    with open(engine_file_path, "wb") as f:
        f.write(trt_model_engine.serialize())
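Once the script has written test.trt, a quick way to confirm the file is usable is to deserialize it again. The sketch below follows the pattern from NVIDIA's Python samples, using trt.Runtime and deserialize_cuda_engine. `load_engine` is a hypothetical helper, and tensorrt is imported inside the function so that the file check works even on a machine without TensorRT.

```python
import os


def load_engine(engine_path):
    """Deserialize a TensorRT engine file and return the engine object."""
    if not os.path.exists(engine_path):
        raise FileNotFoundError("Engine file not found: {}".format(engine_path))
    import tensorrt as trt  # lazy import: only needed once the file exists

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f, trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    if engine is None:
        raise RuntimeError("Failed to deserialize {}".format(engine_path))
    return engine


# Example usage (needs TensorRT 8.x and the test.trt produced above):
# engine = load_engine("test.trt")
# for i in range(engine.num_bindings):
#     print(engine.get_binding_name(i), engine.get_binding_shape(i))
```

The binding-enumeration calls in the usage comment are the TensorRT 8.x API. An engine file is tied to the TensorRT version and GPU it was built with, so load it on the same setup.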
Method 2: go into the bin folder inside the TensorRT folder, open a terminal there, and run:
./trtexec --onnx=test.onnx --saveEngine=test.engine
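If you convert several models, it can be handy to assemble the same trtexec call from Python. This sketch only uses the two flags shown in the command above; `trtexec_command` is a hypothetical helper, and the default "./trtexec" assumes you run it from the bin folder.

```python
def trtexec_command(onnx_path, engine_path, trtexec="./trtexec"):
    """Return the trtexec argument list for converting an ONNX file to an engine."""
    return [
        trtexec,
        "--onnx={}".format(onnx_path),
        "--saveEngine={}".format(engine_path),
    ]


if __name__ == "__main__":
    # Prints the command; nothing is executed here
    print(" ".join(trtexec_command("test.onnx", "test.engine")))
```

To actually run the conversion, pass the returned list to subprocess.run(cmd, check=True) from the standard library.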
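trtexec also lets you measure the model's running time.
Sometimes the following message is reported when converting to an engine: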
onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64
weights, while TensorRT does not natively support INT64. Attempting to
cast down to INT32.
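This happens because your ONNX model was generated with INT64 weights, while TensorRT natively supports INT32, so TensorRT tries to cast them down. If the conversion fails at this point, convert the ONNX model into a simpler one with onnx-simplifier, which can be installed directly with pip install onnx-simplifier.
Once it is installed, run the conversion: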
python -m onnxsim ./test.onnx ./test_sim.onnx
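The same simplification can be done from a Python script through onnx-simplifier's `simplify` function, which returns the simplified model together with a flag saying whether it passed validation. A minimal sketch, with `simplify_onnx` as a hypothetical wrapper; onnx and onnxsim are imported inside the function so that the file check runs first.

```python
import os


def simplify_onnx(src_path, dst_path):
    """Simplify an ONNX model with onnx-simplifier and save the result."""
    if not os.path.exists(src_path):
        raise FileNotFoundError("ONNX file not found: {}".format(src_path))
    import onnx  # lazy imports: only needed once the file exists
    from onnxsim import simplify

    model = onnx.load(src_path)
    model_simplified, ok = simplify(model)
    if not ok:
        raise RuntimeError("The simplified model failed onnx-simplifier's check")
    onnx.save(model_simplified, dst_path)
    return dst_path


# Example usage (needs onnx and onnx-simplifier installed):
# simplify_onnx("./test.onnx", "./test_sim.onnx")
```

After that, feed test_sim.onnx to the script or to trtexec instead of test.onnx.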
That is the whole conversion process. I have tested it myself without problems, so go ahead and give it a try.