YOLOV4

Paper: YOLOv4: Optimal Speed and Accuracy of Object Detection

GitHub: https://github.com/AlexeyAB/darknet

 

[Image 1: YOLOV4]

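The paper consolidates a range of accuracy-improving tricks and adds them to YOLOV3 to obtain the final YOLOV4. The result reaches 43.5% AP on COCO at about 65 FPS on a Tesla V100: an excellent combination of speed and accuracy.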

 

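Main contributions:

  1. Proposes YOLOV4, an efficient and fast object detection framework.
  2. Analyzes and verifies how Bag-of-Freebies and Bag-of-Specials methods affect detector training and inference.
  3. Modifies existing methods so that YOLOV4 is suitable for single-GPU training, which greatly lowers the barrier to training it.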

 

Detection framework comparison:

[Image 2: YOLOV4]

Input:

 Image, Patches, Image Pyramid

Backbones:

VGG16, ResNet-50, SpineNet, EfficientNet-B0/B7, CSPResNeXt50, CSPDarknet53

Neck:
• Additional blocks: SPP, ASPP, RFB, SAM
• Path-aggregation blocks: FPN, PAN, NAS-FPN, Fully-connected FPN, BiFPN, ASFF, SFAM

Heads:
• Dense Prediction (one-stage): RPN, SSD, YOLO, RetinaNet (anchor based); CornerNet, CenterNet, MatrixNet, FCOS (anchor free)
• Sparse Prediction (two-stage): Faster R-CNN, R-FCN, Mask R-CNN (anchor based); RepPoints (anchor free)

 

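Bag of freebies:

Improvements that only change the training strategy, or only increase the training cost, without adding any cost at inference time are called Bag of freebies.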

We call these methods that only change the training strategy or only increase the training cost as “bag of freebies.”

The improvements in this category include:

(1) Data augmentation

brightness, contrast, hue, saturation, noise, random scaling, cropping, flipping, rotating, CutOut, MixUp, CutMix
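
To make one of these concrete, here is a minimal sketch of the CutMix idea: a rectangular patch from one image is pasted into another, and the labels are mixed in proportion to the pasted area. This is the classification-style formulation on plain nested lists; the function name `cutmix` and the fixed `box` argument are assumptions for illustration (in practice the box is sampled at random, and a detector also has to adjust the bounding-box labels).

```python
def cutmix(img_a, img_b, label_a, label_b, box):
    """Paste region `box` of img_b into a copy of img_a and mix the labels.

    img_a, img_b: 2D lists (H x W) of pixel values with the same shape.
    label_a, label_b: one-hot (or soft) label lists of the same length.
    box: (top, left, bottom, right), end-exclusive, in pixel coordinates.
    """
    h, w = len(img_a), len(img_a[0])
    top, left, bottom, right = box
    mixed = [row[:] for row in img_a]  # copy so img_a is not modified
    for y in range(top, bottom):
        for x in range(left, right):
            mixed[y][x] = img_b[y][x]
    # The label weight of img_b is the fraction of pixels taken from it.
    lam_b = (bottom - top) * (right - left) / (h * w)
    label = [(1 - lam_b) * a + lam_b * b for a, b in zip(label_a, label_b)]
    return mixed, label


if __name__ == "__main__":
    a = [[0] * 4 for _ in range(4)]
    b = [[1] * 4 for _ in range(4)]
    mixed, label = cutmix(a, b, [1.0, 0.0], [0.0, 1.0], (0, 0, 2, 2))
    print(mixed, label)
```

Since this only changes the training samples, it costs nothing at inference time, which is what makes it a "freebie".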

(2) Regularization methods

DropOut, DropPath, Spatial DropOut, DropBlock

(3) Hard example mining and related methods for class imbalance and label representation

hard negative example mining, online hard example mining, focal loss, label smoothing
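
Two of these are easy to show in a few lines. The sketch below gives a binary focal loss and uniform label smoothing in plain Python; the function names and the default values (`gamma=2.0`, `alpha=0.25`, `eps=0.1`) are common illustrative choices, not settings taken from the YOLOV4 configuration.

```python
import math


def smooth_labels(one_hot, eps=0.1):
    """Uniform label smoothing: move eps of the probability mass to all classes."""
    k = len(one_hot)
    return [(1 - eps) * t + eps / k for t in one_hot]


def binary_focal_loss(p, target, gamma=2.0, alpha=0.25):
    """Focal loss for one predicted probability p in (0, 1) and target in {0, 1}."""
    p_t = p if target == 1 else 1 - p              # probability of the true class
    alpha_t = alpha if target == 1 else 1 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(smooth_labels([0, 1, 0, 0]))
    print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
```

With `gamma=0` the focal loss reduces to a weighted cross-entropy, so the exponent is what shifts the training signal toward hard examples.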

(4) Loss functions (bounding box regression)

MSE, IoU, GIoU, CIoU, DIoU
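
The IoU-based losses differ only in the penalty term added to plain IoU. The following sketch computes the four metrics for two axis-aligned boxes in `(x1, y1, x2, y2)` format; the corresponding loss is `1 - metric`. The function name `iou_variants` is an assumption for illustration, and the code assumes both boxes have positive width and height.

```python
import math


def iou_variants(a, b):
    """Return (IoU, GIoU, DIoU, CIoU) for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Smallest box enclosing both inputs.
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)

    # GIoU: penalize the part of the enclosing box not covered by the union.
    c_area = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (c_area - union) / c_area

    # DIoU: penalize squared center distance over squared enclosing diagonal.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    diag2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    diou = iou - rho2 / diag2

    # CIoU: additionally penalize aspect-ratio mismatch.
    v = (4 / math.pi ** 2) * (math.atan((bx2 - bx1) / (by2 - by1))
                              - math.atan((ax2 - ax1) / (ay2 - ay1))) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    ciou = diou - alpha * v
    return iou, giou, diou, ciou


if __name__ == "__main__":
    print(iou_variants((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike plain IoU, the GIoU, DIoU and CIoU terms still give a useful gradient when the two boxes do not overlap at all.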

 

Bag of specials:

Methods that add only a small amount of inference cost but significantly improve model accuracy are called Bag of specials.

For those plugin modules and post-processing methods that only increase the inference cost by a small amount but can significantly improve the accuracy of object detection, we call them “bag of specials”.

The improvements in this category include:

(1) Enlarging the receptive field

SPP, ASPP, RFB, Spatial Pyramid Matching (SPM)
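
As a rough illustration of the SPP variant used in YOLO-style detectors, the sketch below applies stride-1 max pooling with several kernel sizes to the same feature map and returns the results side by side, which stands in for the channel-wise concatenation. It works on a single 2D list rather than a real tensor; the function names are hypothetical, and the default kernel sizes `(1, 5, 9, 13)` follow the YOLOv3-SPP description cited in the paper.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling with 'same' output size on a 2D list (k must be odd)."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Clip the window at the borders, equivalent to padding with -inf.
            window = [fmap[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(fmap, kernel_sizes=(1, 5, 9, 13)):
    """Return one pooled map per kernel size; stacking them is the channel concat."""
    return [max_pool_same(fmap, k) for k in kernel_sizes]


if __name__ == "__main__":
    fmap = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for pooled in spp_block(fmap, kernel_sizes=(1, 3)):
        print(pooled)
```

Because the stride is 1, every output keeps the input resolution, so the block widens the receptive field without changing the spatial size of the feature map.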

(2) Attention methods

Squeeze-and-Excitation (SE), Spatial Attention Module (SAM)

[Image 3: YOLOV4]

(3) Skip connections and feature integration:

Residual connections, Weighted residual connections, Multi-input weighted residual connections, Cross stage partial connections (CSP), FPN, SFAM, ASFF, BiFPN

(4) Activation functions:

ReLU, leaky-ReLU, parametric-ReLU, ReLU6, SELU, Swish, Mish 
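
For reference, three of these activations written out as scalar functions. This is a minimal sketch using only the standard library; the `slope=0.1` default for leaky-ReLU is an illustrative assumption, and the `sigmoid` and `softplus` helpers are written in a numerically stable form so large inputs do not overflow.

```python
import math


def sigmoid(x):
    """Logistic function, computed without overflowing for large |x|."""
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    e = math.exp(x)
    return e / (1 + e)


def softplus(x):
    """log(1 + exp(x)), computed without overflowing for large x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def leaky_relu(x, slope=0.1):
    return x if x > 0 else slope * x


def swish(x):
    return x * sigmoid(x)


def mish(x):
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.0):
        print(x, leaky_relu(x), swish(x), mish(x))
```

Swish and Mish are both smooth and allow small negative outputs, in contrast to ReLU, which is zero for all negative inputs.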

(5) NMS

greedy NMS, soft NMS
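
The difference between the two is easiest to see side by side. In the sketch below, greedy NMS discards any box that overlaps an already kept box by more than a threshold, while soft NMS (Gaussian variant) keeps the box but decays its score. Function names, the `(x1, y1, x2, y2)` box format and the default values of `iou_thresh`, `sigma` and `score_thresh` are assumptions for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft NMS: return (kept indices in selection order, decayed scores)."""
    scores = list(scores)  # work on a copy
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break  # everything left scores even lower
        keep.append(best)
        for i in remaining:
            # Decay scores of overlapping boxes instead of removing them.
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
    return keep, scores


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
    print(soft_nms(boxes, scores))
```

In the usage example, greedy NMS drops the second box entirely, whereas soft NMS keeps it with a reduced score, which can help when two real objects overlap heavily.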

(6) Normalization methods:

Batch Normalization (BN), Cross-GPU Batch Normalization (CGBN or SyncBN), Filter Response Normalization (FRN), Cross-Iteration Batch Normalization (CBN)

[Image 4: YOLOV4]

 

Choosing the base network architecture:

[Image 5: YOLOV4]

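In the paper's backbone comparison, CSPDarknet53 has a larger receptive field than CSPResNeXt50 and runs faster than both CSPResNeXt50 and EfficientNet-B3, so CSPDarknet53 is chosen as the backbone of YOLOV4.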

 

A model with high classification accuracy does not necessarily have high detection accuracy:

A reference model which is optimal for classification is not always optimal for a detector.

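What a detector needs:

  1. Higher input resolution, which helps detect multiple small objects
  2. More layers, giving a larger receptive field to cover the larger network input
  3. More parameters, giving the model more capacity to detect objects of different sizes in a single image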

 

 

Experimental results:

[Image 6: YOLOV4]

[Image 7: YOLOV4]

[Image 8: YOLOV4]

 

 

 
