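Paper: YOLOv4: Optimal Speed and Accuracy of Object Detection
GitHub: https://github.com/AlexeyAB/darknet
The paper gathers a range of accuracy-improving tricks and adds them to YOLOv3, producing YOLOv4. The result reaches 43.5% AP on COCO and runs at about 65 FPS on a Tesla V100: strong on both speed and accuracy.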
Main contributions:
Comparison of detection frameworks:
Input:
Image, Patches, Image Pyramid
Backbones:
VGG16, ResNet-50, SpineNet, EfficientNet-B0/B7, CSPResNeXt50, CSPDarknet53
Neck:
• Additional blocks: SPP, ASPP, RFB, SAM
• Path-aggregation blocks: FPN, PAN, NAS-FPN, Fully-connected FPN, BiFPN, ASFF, SFAM
Heads:
• Dense Prediction (one-stage): RPN, SSD, YOLO, RetinaNet (anchor based); CornerNet, CenterNet, MatrixNet, FCOS (anchor free)
• Sparse Prediction (two-stage): Faster R-CNN, R-FCN, Mask R-CNN (anchor based); RepPoints (anchor free)
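Bag of freebies:
Improvements that only change the training strategy, adding training cost but no inference cost, are called the bag of freebies. In the paper's words:
"We call these methods that only change the training strategy or only increase the training cost as 'bag of freebies.'"
Improvements in this category include: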
(1) Data augmentation:
brightness, contrast, hue, saturation, noise, random scaling, cropping, flipping, rotating, CutOut, MixUp, CutMix
(2) Regularization:
DropOut, DropPath, Spatial DropOut, DropBlock
(3) Hard example mining:
hard negative example mining, online hard example mining, focal loss, label smoothing
(4) Loss functions:
MSE, IoU, GIoU, CIoU, DIoU
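To make the IoU-based losses in item (4) concrete, here is a minimal pure-Python sketch of IoU, GIoU, DIoU, and CIoU for a single pair of boxes. It follows the commonly published definitions rather than the darknet code, and it assumes boxes are given as (x1, y1, x2, y2) with positive width and height. The function names and the EPS constant are illustrative, not taken from the paper or the repository.

```python
import math

EPS = 1e-9  # illustrative guard against division by zero


def _overlap(a, b):
    """Return (intersection area, union area) of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter, union


def iou(a, b):
    inter, union = _overlap(a, b)
    return inter / union if union > 0 else 0.0


def giou(a, b):
    """IoU minus the fraction of the enclosing box not covered by the union."""
    _, union = _overlap(a, b)
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c_area = cw * ch
    if c_area <= 0:
        return iou(a, b)
    return iou(a, b) - (c_area - union) / c_area


def diou(a, b):
    """IoU minus squared center distance over squared enclosing-box diagonal."""
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    diag2 = cw * cw + ch * ch
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    if diag2 <= 0:
        return iou(a, b)
    return iou(a, b) - (dx * dx + dy * dy) / diag2


def ciou(a, b):
    """DIoU minus an aspect-ratio consistency penalty alpha * v."""
    wa, ha = a[2] - a[0], a[3] - a[1]
    wb, hb = b[2] - b[0], b[3] - b[1]
    v = (4 / math.pi ** 2) * (math.atan(wa / (ha + EPS)) - math.atan(wb / (hb + EPS))) ** 2
    alpha = v / ((1 - iou(a, b)) + v + EPS)
    return diou(a, b) - alpha * v


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    for name, fn in [("IoU", iou), ("GIoU", giou), ("DIoU", diou), ("CIoU", ciou)]:
        # Each regression loss is 1 minus the corresponding metric.
        print(name, "loss:", 1 - fn(pred, target))
```

Unlike plain IoU, the GIoU, DIoU, and CIoU values keep changing when the boxes do not overlap at all, which is why they give a usable training signal in that case.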
Bag of specials:
Methods that add only a small amount of computation but greatly improve model accuracy are called the bag of specials. In the paper's words:
"For those plugin modules and post-processing methods that only increase the inference cost by a small amount but can significantly improve the accuracy of object detection, we call them 'bag of specials'."
Improvements in this category include:
(1) Enlarging the receptive field:
SPP, ASPP, RFB, Spatial Pyramid Matching (SPM)
(2) Attention:
Squeeze-and-Excitation (SE), Spatial Attention Module (SAM)
(3) Skip connections:
Residual connections, Weighted residual connections, Multi-input weighted residual connections, Cross stage partial connections (CSP), FPN, SFAM, ASFF, BiFPN
(4) Activation functions:
ReLU, leaky-ReLU, parametric-ReLU, ReLU6, SELU, Swish, Mish
(5) NMS:
greedy NMS, soft NMS
(6) Normalization:
Batch Normalization (BN), Cross-GPU Batch Normalization (CGBN or SyncBN), Filter Response Normalization (FRN), Cross-Iteration Batch Normalization (CBN)
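Of the items above, the two NMS variants in (5) are the easiest to show in code. The following is a rough pure-Python sketch, not the darknet implementation. It assumes boxes in (x1, y1, x2, y2) format, and the soft NMS uses the Gaussian decay variant. The function names and the default values for iou_thresh, sigma, and score_thresh are illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Keep the best box, drop every box overlapping it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft NMS: decay overlapping scores instead of removing boxes.

    Returns (index, decayed score) pairs in selection order.
    """
    scores = list(scores)  # work on a copy, scores are rescaled in place
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break  # everything left scores even lower
        keep.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
    print(soft_nms(boxes, scores))
```

In this example, greedy NMS discards the second box because it overlaps the first heavily, while soft NMS keeps it with a reduced score. That difference matters when two real objects are close together.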
Choosing the backbone:
CSPDarknet53 has a larger receptive field than CSPResNeXt50 and runs faster than both CSPResNeXt50 and EfficientNet-B3, so it is chosen as the backbone of YOLOv4.
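A model with high classification accuracy does not necessarily give high detection accuracy:
"A reference model which is optimal for classification is not always optimal for a detector."
What a detector requires:
Experimental results: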