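LSTR (End-to-end Lane Shape Prediction with Transformers; official repository on GitHub: liuruijin17/LSTR) was released two years ago but is still an excellent lane detection model. It can be extended to detect other line shapes, uses few resources, and runs inference very fast. However, the software versions it depends on, such as python3.6 and torch1.5, are now quite old. An ONNX model exported in that environment may fail when the newer TensorRT8.4 parses it to build an engine, and the problem persists under python3.6 even after upgrading torch to 1.10.1. For example, exporting ONNX with the following script: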
torch.onnx.export(model,
                  images,                    # model input (or a tuple for multiple inputs)
                  onnx_file,
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=11,
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['images', 'masks'],  # the model's input names
                  output_names=['output_class', 'output_curve'])
fails with the following error:
/root/anaconda3/envs/lstr/lib/python3.6/site-packages/torch/onnx/symbolic_helper.py:325: UserWarning: Type cannot be inferred, which might cause exported graph to produce incorrect results.
warnings.warn("Type cannot be inferred, which might cause exported graph to produce incorrect results.")
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[240, 1, 1, 240]' is invalid for input of size 240 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[3, 1, 1, 240]' is invalid for input of size 240 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[3, 1, 1, 240]' is invalid for input of size 240 (function ComputeConstantFolding)
Traceback (most recent call last):
File "experiments/export2onnx.py", line 81, in <module>
export2onnx(model, onnx_file)
File "experiments/export2onnx.py", line 54, in export2onnx
output_names = ['output_class', 'output_curve'])
File "/root/anaconda3/envs/lstr/lib/python3.6/site-packages/torch/onnx/__init__.py", line 320, in export
custom_opsets, enable_onnx_checker, use_external_data_format)
File "/root/anaconda3/envs/lstr/lib/python3.6/site-packages/torch/onnx/utils.py", line 111, in export
custom_opsets=custom_opsets, use_external_data_format=use_external_data_format)
File "/root/anaconda3/envs/lstr/lib/python3.6/site-packages/torch/onnx/utils.py", line 729, in _export
dynamic_axes=dynamic_axes)
File "/root/anaconda3/envs/lstr/lib/python3.6/site-packages/torch/onnx/utils.py", line 545, in _model_to_graph
_export_onnx_opset_version)
RuntimeError: shape '[240, 1, 1, 240]' is invalid for input of size 240
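Changing do_constant_folding=True to do_constant_folding=False in the torch.onnx.export() call above lets the ONNX export finish without errors, but when TensorRT parses the resulting ONNX to build an engine, it reports that an intermediate node has mismatched data dimensions: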
[07/13/2023-05:11:04] [I] [TRT] MatMul_279: broadcasting input1 to make tensors conform, dims(input0)=[240,1,32][NONE] dims(input1)=[1,32,32][NONE].
[07/13/2023-05:11:04] [I] [TRT] MatMul_282: broadcasting input1 to make tensors conform, dims(input0)=[240,1,32][NONE] dims(input1)=[1,32,32][NONE].
[07/13/2023-05:11:04] [E] Error[4]: [graphShapeAnalyzer.cpp::analyzeShapes::1294] Error Code 4: Miscellaneous (IShuffleLayer Reshape_289: reshape changes volume. Reshaping [240,1,32] to [240,480,16].)
[07/13/2023-05:11:04] [E] [TRT] ModelImporter.cpp:773: While parsing node number 289 [Reshape -> "493"]:
[07/13/2023-05:11:04] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[07/13/2023-05:11:04] [E] [TRT] ModelImporter.cpp:775: input: "481"
input: "492"
output: "493"
name: "Reshape_289"
op_type: "Reshape"
[07/13/2023-05:11:04] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[07/13/2023-05:11:04] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:180 In function parseGraph:
[6] Invalid Node - Reshape_289
[graphShapeAnalyzer.cpp::analyzeShapes::1294] Error Code 4: Miscellaneous (IShuffleLayer Reshape_289: reshape changes volume. Reshaping [240,1,32] to [240,480,16].)
[07/13/2023-05:11:04] [E] Failed to parse onnx file
[07/13/2023-05:11:04] [I] Finish parsing network model
[07/13/2023-05:11:04] [E] Parsing model failed
[07/13/2023-05:11:04] [E] Failed to create engine from model or file.
[07/13/2023-05:11:04] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8401] # /usr/src/tensorrt/bin/trtexec --onnx=LSTR_iter_10000.onnx --saveEngine=LSTR_iter_10000.trt
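The "reshape changes volume" message in the log above means that the source and target shapes of Reshape_289 hold different numbers of elements. A quick check with the two shapes copied from the log makes the mismatch concrete. This is only a minimal sketch; the helper name `volume` is mine and is not part of TensorRT:

```python
from math import prod  # available in Python 3.8+

def volume(shape):
    """Return the number of elements in a tensor of the given shape."""
    return prod(shape)

# Shapes copied from the Reshape_289 error message in the log.
src = [240, 1, 32]
dst = [240, 480, 16]

print(volume(src))                 # 7680
print(volume(dst))                 # 1843200
print(volume(src) == volume(dst))  # False, so the reshape is rejected
```

The target shape holds 240 times as many elements as the source, which is why the parser rejects the node instead of broadcasting.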
I later found that with a newer python (3.7 or 3.8) plus torch1.12.1, the export works with do_constant_folding=True, and TensorRT8.4.1.5 can successfully build an engine from the exported ONNX.
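To avoid exporting from the old environment by accident, an export script could check the versions first. The sketch below is only an illustration: the helper names `parse_version` and `export_env_ok` are hypothetical, and treating everything at or above python3.7 and torch1.12.1 as fine is my assumption, since only the combinations named in this post were actually tried.

```python
import re

def parse_version(text):
    """Turn a string such as '1.12.1+cu113' into (1, 12, 1), dropping any local suffix."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])

def export_env_ok(python_version, torch_version):
    """True if both versions are at least the ones reported to work in this post.

    Assumption: newer versions behave like the tested ones.
    """
    return (parse_version(python_version) >= (3, 7)
            and parse_version(torch_version) >= (1, 12, 1))

print(export_env_ok("3.6.13", "1.10.1"))        # False
print(export_env_ok("3.8.10", "1.12.1+cu113"))  # True
```

In a real script the two arguments would come from `platform.python_version()` and `torch.__version__`.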
The environment setup steps can be summarized as follows:
1. Create the conda environment lstr:
conda create -n lstr python=3.8
2. Edit requirements.txt and upgrade the versions of mkl-service, numpy, torch, and torchvision:
mkl-service==2.4.0
numpy==1.18.5
torch==1.12.1
torchvision==0.13.1
The complete requirements.txt is as follows:
alembic==1.4.2
amqp-worker==0.2
axial-positional-embedding==0.2.1
certifi==2020.4.5.1
cloudpickle==1.4.1
cycler==0.10.0
Cython==0.29.17
decorator==4.4.2
diff-match-patch==20120106
dill==0.3.3
docutils==0.16
future==0.18.2
h5py==2.10.0
imageio==2.8.0
imgaug==0.4.0
joblib==0.15.1
kiwisolver==1.2.0
lockfile==0.12.2
Mako==1.1.3
MarkupSafe==1.1.1
matplotlib==3.2.1
# update
mkl-service==2.4.0
multiprocess==0.70.11.1
networkx==2.4
# update
numpy==1.18.5
opencv-contrib-python==4.2.0.34
opencv-python==4.2.0.34
p-tqdm==1.3.3
pandas==1.0.4
pathos==0.2.7
pika==1.1.0
Pillow==7.1.2
pox==0.2.9
ppft==1.6.6.3
product-key-memory==0.1.10
progressbar==2.5
progressbar2==3.53.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
python-daemon==2.2.4
python-dateutil==2.8.1
python-editor==1.0.4
python-utils==2.4.0
pytz==2020.1
PyWavelets==1.1.1
scikit-image==0.17.2
scikit-learn==0.23.1
scipy==1.4.1
Shapely==1.7.0
six==1.14.0
sklearn==0.0
SQLAlchemy==1.3.17
submitit==1.0.0
tabulate==0.8.7
thop==0.0.31.post2005241907
threadpoolctl==2.1.0
tifffile==2020.6.3
# update
torch==1.12.1
torchvision==0.13.1
tqdm==4.46.1
typing-extensions==3.7.4.2
ujson==3.0.0
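If you would rather not edit the file by hand, the four version bumps from step 2 can be applied with a few lines of Python. This is a minimal sketch: the function name `update_pins` is hypothetical, and the old versions in `sample` are placeholders for illustration, not the repository's actual pins.

```python
def update_pins(requirements_text, new_pins):
    """Rewrite 'name==version' lines for the packages listed in new_pins."""
    out = []
    for line in requirements_text.splitlines():
        name = line.split("==")[0].strip()
        # Comment lines and unrelated packages are passed through unchanged.
        if "==" in line and name in new_pins:
            line = f"{name}=={new_pins[name]}"
        out.append(line)
    return "\n".join(out) + "\n"

# Target versions taken from step 2 of this post.
NEW_PINS = {
    "mkl-service": "2.4.0",
    "numpy": "1.18.5",
    "torch": "1.12.1",
    "torchvision": "0.13.1",
}

# Placeholder input; the old version numbers here are illustrative only.
sample = "numpy==1.0.0\nopencv-python==4.2.0.34\ntorch==1.5.0\n"
print(update_pins(sample, NEW_PINS))
```

To use it on the real file, read requirements.txt into a string, pass it through `update_pins`, and write the result back.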
3. Install the required packages:
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
After that you can run training or testing:
python train.py LSTR
python train.py LSTR --iter 10000
python test.py LSTR --modality images --split testing --debug --image_root metr
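and export ONNX (the source code on GitHub does not provide an ONNX export script; you need to write one yourself, using test.py and the usual ONNX export pattern as references):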
python experiments/export2onnx.py
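Once the ONNX file exports cleanly, the engine build can be scripted as well. The sketch below only assembles the same trtexec command line that appears in the log earlier in this post (the --onnx and --saveEngine flags and the trtexec path are taken from that log); the function name `trtexec_command` is hypothetical, and the command is printed rather than executed.

```python
import shlex

def trtexec_command(onnx_path, engine_path,
                    trtexec="/usr/src/tensorrt/bin/trtexec"):
    """Build the trtexec argument list for converting an ONNX file to an engine."""
    return [trtexec, f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]

cmd = trtexec_command("LSTR_iter_10000.onnx", "LSTR_iter_10000.trt")

# Print the command in a shell-safe form for inspection.
print(" ".join(shlex.quote(part) for part in cmd))

# To actually run it on a machine with TensorRT installed:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Keeping the command as a list instead of a single string avoids shell quoting problems if the model path contains spaces.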