Step 2: Converting a PyTorch model to an ONNX model: steps and possible problems

I. Environment

Local machine (Win10): CUDA 11.0 + Python 3.7.4 + PyTorch 1.7.1 + ONNX 1.9.0
Server (Ubuntu 18.04): CUDA 11.0 + Python 3.7.6 + PyTorch 1.7.1 + TensorRT 8.0.0.3

II. Possible problems

1. PyTorch and TensorRT version compatibility

Re-save the PyTorch 1.7.1 weights in a format compatible with versions below 1.5.0, then convert the model to ONNX. Taking efficientnet_b1 as an example:

import torch
import configs
import sys
sys.path.append('face_classify')
from timm.models.factory import create_model
from collections import OrderedDict

def torch_2_onnx(model, MODEL_ONNX_PATH):
    OPERATOR_EXPORT_TYPE = torch._C._onnx.OperatorExportTypes.ONNX
    """
    这里构建网络的输入,有几个就构建几个
    和网络正常的inference时输入一致就可以
    """
    device = 'cpu'
    org_dummy_input = torch.randn(8, 3, 240, 240, device=device)

    # This is the first step towards supporting dynamic inputs and outputs.
    # Every dynamic dimension of each input and output must be declared here; indices start at 0.
    # Note: the channel dimension (index 1) cannot be dynamic, which is why index 1 is not listed.
    dynamic_axes = {
        'inputs': {0: 'batch_size', 2: 'height', 3: 'width'},
        'outputs': {0: 'batch_size', 1: 'class'},
    }
    torch.onnx.export(model,
                      org_dummy_input,
                      MODEL_ONNX_PATH,
                      verbose=True,
                      opset_version=12,
                      operator_export_type=OPERATOR_EXPORT_TYPE,
                      input_names=['inputs'],
                      output_names=['outputs'],
                      dynamic_axes=dynamic_axes
                      )
    print("Exported model to {}".format(MODEL_ONNX_PATH))

model = create_model(
            model_name=configs.model_name,
            num_classes=configs.num_classes,
            in_chans=configs.in_chans,
            checkpoint_path=configs.face_classify_model)

# Must run on CPU; otherwise the final torch.onnx.export call will raise an error
device = 'cpu'

# Load the PyTorch 1.7 weights. Adapt this to the format of your own checkpoint file.
checkpoint = torch.load(r'/path/to/efficientnet_b1.pth.tar',
                        map_location=device)
state_dict_key = 'state_dict'
if state_dict_key and state_dict_key in checkpoint:
    new_state_dict = OrderedDict()
    for k, v in checkpoint[state_dict_key].items():
        # strip the 'module.' prefix added by DataParallel/DistributedDataParallel
        name = k[7:] if k.startswith('module') else k
        new_state_dict[name] = v
    state_dict = new_state_dict
else:
    state_dict = checkpoint
model.load_state_dict(state_dict)

# Save as a PyTorch 1.4-compatible model (legacy, non-zipfile serialization)
torch.save(model.state_dict(), 'onnx_weights/for_onnx_eff_b1.pth', _use_new_zipfile_serialization=False)

# Reload the re-saved weights and export the ONNX model from the PyTorch-1.4-style checkpoint
state_dict = torch.load('onnx_weights/for_onnx_eff_b1.pth', map_location=device)
model.load_state_dict(state_dict)
model.train(False)

model_onnx_path = 'efficientnet_b1.onnx'
torch_2_onnx(model, model_onnx_path)

After running this script, efficientnet_b1.onnx is generated and can be used directly when importing into TensorRT later.
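
Before moving on to TensorRT, it can be worth sanity-checking the exported file. The snippet below is a minimal sketch, not part of the original workflow; it assumes the onnx and onnxruntime packages are installed, validates the graph, and runs inference at two batch sizes to confirm that the dynamic batch axis declared above actually works.

import numpy as np
import onnx
import onnxruntime as ort

# Validate the structure of the exported graph
onnx_model = onnx.load('efficientnet_b1.onnx')
onnx.checker.check_model(onnx_model)

# Run the model at two different batch sizes to exercise the dynamic batch axis
session = ort.InferenceSession('efficientnet_b1.onnx')
for batch in (8, 2):
    dummy = np.random.randn(batch, 3, 240, 240).astype(np.float32)
    outputs = session.run(['outputs'], {'inputs': dummy})
    print(batch, '->', outputs[0].shape)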

2. Some PyTorch operations are not supported by ONNX export

The final conversion step fails with: RuntimeError: Exporting the operator silu to ONNX opset version 9 is not supported. Please open a bug to request ONNX export support for the missing operator. The cause is that the ONNX exporter does not support silu. The workaround is to locate the source file and rewrite the operation in an equivalent way. Source location: D:\apps\anaconda3\envs\python3.7\Lib\site-packages\torch\nn\modules\activation.py
Original code:

class SiLU(Module):

    __constants__ = ['inplace']
    inplace: bool

    def __init__(self, inplace: bool = False):
        super(SiLU, self).__init__()
        self.inplace = inplace

    def forward(self, input: Tensor) -> Tensor:
        # The F.silu call below is what the ONNX exporter cannot handle
        return F.silu(input, inplace=self.inplace)

After the modification:

class SiLU(Module):

    __constants__ = ['inplace']
    inplace: bool

    def __init__(self, inplace: bool = False):
        super(SiLU, self).__init__()
        self.inplace = inplace

    def forward(self, input: Tensor) -> Tensor:
        # Replace the F.silu call with an explicit, ONNX-exportable formulation
        return input * torch.sigmoid(input)
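
As an alternative to editing the file under site-packages, the same replacement can be applied at runtime by overriding the activation's forward method before building and exporting the model. This is only a sketch of that idea; depending on the timm version, the model may use timm's own Swish/SiLU class rather than torch.nn.SiLU, in which case the same patch would need to target that class instead.

import torch
import torch.nn as nn

def _silu_export_friendly(self, input):
    # Same computation as F.silu, written with ops the ONNX exporter supports.
    # The inplace flag is ignored here, which is harmless for export.
    return input * torch.sigmoid(input)

# Monkey-patch before create_model() / torch_2_onnx(), so no library source needs editing
nn.SiLU.forward = _silu_export_friendly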
