Exporting YOLOv8 to an ONNX Model, Then to an RKNN Model

1. Downloading the Code

Complete C++ deployment code and model example: yolov8_rknn_Cplusplus
Official YOLOv8 source code: https://github.com/ultralytics/ultralytics/tree/main

Modify parts of the official YOLOv8 source code to export an ONNX model, then convert the ONNX model to an RKNN model.

2. Code Modifications

The training code follows the officially open-sourced YOLOv8 training code. Because SiLU is not yet supported on some board-side (edge) chips, it is replaced with ReLU.
Note: replacing SiLU with ReLU as the activation also makes inference noticeably faster.

class Conv(nn.Module):
    # Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)
    # default_act = nn.SiLU()  # original default activation
    default_act = nn.ReLU()   # changed: SiLU does not fuse with conv on RKNN

    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()
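
A quick sanity check before export (a minimal sketch using the ultralytics Python API; the weight path is illustrative) confirms the retrained model no longer contains any SiLU modules:

import torch.nn as nn
from ultralytics import YOLO

model = YOLO("weights/best.pt").model  # illustrative path to the retrained weights
silu = sum(isinstance(m, nn.SiLU) for m in model.modules())
relu = sum(isinstance(m, nn.ReLU) for m in model.modules())
print(f"SiLU modules: {silu}, ReLU modules: {relu}")  # expect 0 SiLU after the change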

Code changes for exporting YOLOv8 to ONNX:
Some operators in YOLOv8's post-processing are inefficient or unsupported on board-side chips, so the ONNX export must work around chip-unfriendly or unsupported operators. The modified parts are shown below.

class Detect(nn.Module):
    """YOLOv8 Detect head for detection models."""
    dynamic = False  # force grid reconstruction
    export = False  # export mode
    shape = None
    anchors = torch.empty(0)  # init
    strides = torch.empty(0)  # init

    ### Added for RKNN-friendly ONNX export: a fixed-weight 1x1 conv that replaces the DFL module
    conv1x1 = nn.Conv2d(16, 1, 1, bias=False).requires_grad_(False)
    aaa = torch.arange(16, dtype=torch.float)
    conv1x1.weight.data[:] = nn.Parameter(aaa.view(1, 16, 1, 1))
    ###
    
    def __init__(self, nc=80, ch=()):  # detection layer
        super().__init__()
        self.nc = nc  # number of classes
        self.nl = len(ch)  # number of detection layers
        self.reg_max = 16  # DFL channels (ch[0] // 16 to scale 4/8/12/16/20 for n/s/m/l/x)
        self.no = nc + self.reg_max * 4  # number of outputs per anchor
        self.stride = torch.zeros(self.nl)  # strides computed during build 
        c2, c3 = max((16, ch[0] // 4, self.reg_max * 4)), max(ch[0], min(self.nc, 100))  # channels
        self.cv2 = nn.ModuleList(
            nn.Sequential(Conv(x, c2, 3), Conv(c2, c2, 3), nn.Conv2d(c2, 4 * self.reg_max, 1)) for x in ch)
        self.cv3 = nn.ModuleList(nn.Sequential(Conv(x, c3, 3), Conv(c3, c3, 3), nn.Conv2d(c3, self.nc, 1)) for x in ch)
        self.dfl = DFL(self.reg_max) if self.reg_max > 1 else nn.Identity()
        # dfl:  DFL(Distribution Focal Loss)

    def forward(self, x):
        """Concatenates and returns predicted bounding boxes and class probabilities."""
        shape = x[0].shape  # BCHW
        ### Added for RKNN-friendly ONNX export
        if self.export and self.format == 'onnx':
            y = []
            for i in range(self.nl):
                t1 = self.cv2[i](x[i])
                t2 = self.cv3[i](x[i])
                y.append(self.conv1x1(t1.view(t1.shape[0], 4, 16, -1).transpose(2, 1).softmax(1)))
                y.append(t2)
            return y
        ###
        for i in range(self.nl):
            x[i] = torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1)
        if self.training:
            return x
        elif self.dynamic or self.shape != shape:    
            self.anchors, self.strides = (x.transpose(0, 1) for x in make_anchors(x, self.stride, 0.5))
            self.shape = shape

        x_cat = torch.cat([xi.view(shape[0], self.no, -1) for xi in x], 2)
        if self.export and self.format in ('saved_model', 'pb', 'tflite', 'edgetpu', 'tfjs'):  # avoid TF FlexSplitV ops
            box = x_cat[:, :self.reg_max * 4]
            cls = x_cat[:, self.reg_max * 4:]
        else:
            box, cls = x_cat.split((self.reg_max * 4, self.nc), 1)
        dbox = dist2bbox(self.dfl(box), self.anchors.unsqueeze(0), xywh=True, dim=1) * self.strides

        if self.export and self.format in ('tflite', 'edgetpu'):
            # Normalize xywh with image size to mitigate quantization error of TFLite integer models as done in YOLOv5:
            # https://github.com/ultralytics/yolov5/blob/0c8de3fca4a702f8ff5c435e67f378d1fce70243/models/tf.py#L307-L309
            # See this PR for details: https://github.com/ultralytics/ultralytics/pull/1695
            img_h = shape[2] * self.stride[0]
            img_w = shape[3] * self.stride[0]
            img_size = torch.tensor([img_w, img_h, img_w, img_h], device=dbox.device).reshape(1, 4, 1)
            dbox /= img_size

        y = torch.cat((dbox, cls.sigmoid()), 1)
        return y if self.export else (y, x)
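
Why the conv1x1 trick works: DFL decodes each box side as the expectation of a 16-bin softmax distribution, and that expectation is exactly a 1x1 convolution whose fixed weights are 0..15, an op the NPU handles well. With this export branch the ONNX model simply returns the per-scale box distributions and class logits (two outputs per detection layer); dist2bbox, sigmoid, and NMS are left to the board-side post-processing. A minimal standalone sketch verifying the equivalence:

import torch
import torch.nn as nn

conv1x1 = nn.Conv2d(16, 1, 1, bias=False).requires_grad_(False)
conv1x1.weight.data[:] = torch.arange(16, dtype=torch.float).view(1, 16, 1, 1)

dist = torch.randn(2, 16, 4, 100).softmax(1)  # softmax over the 16 DFL bins
bins = torch.arange(16, dtype=torch.float).view(1, 16, 1, 1)
expectation = (dist * bins).sum(1, keepdim=True)  # DFL-style expectation
print(torch.allclose(conv1x1(dist), expectation, atol=1e-5))  # True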

3. Export the ONNX Model, Then the RKNN Model

Command to export the ONNX model:

yolo export model=/**/weights/best.pt format=onnx opset=12
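
Equivalently, the export can be run from Python (a sketch using the ultralytics API; the weight path is the same placeholder as in the command above):

from ultralytics import YOLO

model = YOLO("/**/weights/best.pt")  # placeholder path, as in the command above
model.export(format="onnx", opset=12)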

To export the RKNN model, run yolov8n_onnx_tensorRT_rknn_horizon/yolov8_rknn/onnx2rknn_demo_ZQ.py.
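
That script follows the standard rknn-toolkit2 conversion flow. A minimal sketch of that flow (the target platform, normalization values, and file names here are illustrative assumptions, not values taken from the demo script):

from rknn.api import RKNN

rknn = RKNN()
# Preprocessing and target chip are assumptions; match them to your training setup and board
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform='rk3588')
rknn.load_onnx(model='best.onnx')
rknn.build(do_quantization=True, dataset='./dataset.txt')  # dataset.txt lists calibration images
rknn.export_rknn('best.rknn')
rknn.release()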

Reference:
yolov8 simulation testing and deployment on Rockchip RKNN and Horizon chips
