Common Knowledge Points in Object Detection

Table of Contents

  • 1. Common object detection architectures
  • 2. Four feature pyramid networks worth knowing
  • 3. Focal Loss
  • 4. FCOS
  • 5. YOLOX beat YOLOv5
  • 6. VFNet
  • 7. YOLO-ReT: real-time model for edge devices
  • 8. FCOS3D: winner of a 3D Detection Challenge
  • Summary

1. Common object detection architectures

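The common object detection architectures worth knowing fall into one-stage detectors (OSD) and two-stage detectors (TSD). Both one-stage and two-stage detectors can be either anchor-based or anchor-free.

One-stage detectors use all feature map locations to predict bounding boxes and labels: dense prediction. Two-stage detectors add a step that extracts proposals (regions of interest); the proposals are used to extract feature map regions, from which bounding boxes and labels are predicted: sparse prediction.

Two-stage detectors do not use all of the features for prediction. TSDs such as Faster R-CNN tend to be more accurate than older OSDs such as SSD, but newer OSDs (EfficientDet, RetinaNet, VFNet, YOLOX, etc.) report better results than two-stage detectors. One-stage detectors also tend to be faster.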

2. Four feature pyramid networks worth knowing

The four Feature Pyramid Network (FPN) designs are FPN, PANet, NAS-FPN, and BiFPN.
(1) FPN uses a top-down pathway to fuse multi-scale features from levels 3 to 7.
(2) PANet adds an additional bottom-up pathway on top of FPN.
(3) NAS-FPN uses neural architecture search to find an irregular feature network topology and then repeatedly applies the same block.
(4) BiFPN is somewhat similar to PANet; it adds shortcut connections for fusion and then repeatedly applies the same block.
The EfficientDet paper includes a figure comparing these four designs.

Other designs exist besides these.
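To make the top-down pathway in (1) concrete, here is a minimal sketch under simplifying assumptions: feature maps are plain 1-D Python lists, each level is half the length of the next finer one, and the lateral and output convolutions of a real FPN are left out. The function names `upsample2x` and `top_down_fuse` are made up for this illustration.

```python
def upsample2x(feature):
    """Nearest-neighbour upsampling of a 1-D feature list by a factor of 2."""
    return [value for value in feature for _ in range(2)]


def top_down_fuse(levels):
    """Fuse features from coarse to fine by upsampling and adding.

    levels: list of 1-D feature lists ordered fine -> coarse; each level
            is assumed to be half the length of the previous one.
    Returns the fused features in the same order.
    """
    fused = [None] * len(levels)
    fused[-1] = list(levels[-1])  # the coarsest level is passed through
    for i in range(len(levels) - 2, -1, -1):
        upsampled = upsample2x(fused[i + 1])
        # A real FPN applies a 1x1 lateral conv to levels[i] before the sum
        # (omitted here as a simplifying assumption).
        fused[i] = [a + b for a, b in zip(levels[i], upsampled)]
    return fused


pyramid = [[1, 1, 1, 1], [2, 2], [3]]
print(top_down_fuse(pyramid))
```

PANet would add a second pass in the opposite direction (fine to coarse) on top of this result.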

3. Focal Loss

Focal loss was introduced in RetinaNet to address the foreground-background class imbalance.
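It is derived from the cross-entropy loss and down-weights well-classified examples. It is used in many one-stage detectors (EfficientDet, FCOS, VFNet and many other models) and also in some two-stage detectors such as Sparse R-CNN. Focal loss helped RetinaNet become the first one-stage detector to surpass two-stage detectors. It can also be used for classification tasks.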
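For reference, the binary form of focal loss is FL(p_t) = -α_t (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. Below is a minimal sketch for a single prediction in plain Python. The function name `binary_focal_loss` and the clipping constant `eps` are assumptions for this example; α = 0.25 and γ = 2 are the commonly used defaults from the RetinaNet paper.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one prediction.

    p: predicted foreground probability in [0, 1]
    y: ground-truth label, 1 (foreground) or 0 (background)
    """
    p = min(max(p, eps), 1.0 - eps)        # avoid log(0)
    p_t = p if y == 1 else 1.0 - p         # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident and correct
hard = binary_focal_loss(0.1, 1)  # confident and wrong
print(easy, hard)
```

With γ = 0 the modulating factor disappears and the expression reduces to α-weighted cross-entropy, which is a quick way to sanity-check an implementation.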

4. FCOS

FCOS is an anchor-free object detector. It was the first anchor-free detector able to compete with anchor-based one-stage and two-stage detectors. Understanding this model is very helpful.
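FCOS reformulates object detection as per-pixel prediction. It uses multi-level prediction to improve recall and to resolve the ambiguity caused by overlapping bounding boxes. It introduces a "center-ness" branch, which helps suppress low-quality detected bounding boxes and improves overall performance by a large margin. It avoids anchor-related computation, such as computing intersection over union (IoU) between anchor boxes and ground-truth boxes during training. The FCOS approach is also used in VFNet, YOLOX, and several other models. A video explanation is available here: https://www.youtube.com/watch?v=_ADYE6QaAAY&list=PL_ekkPIRkFtF8Gajuj9Bb6dyJK1QFku9E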
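As a small illustration of the per-pixel formulation: for a location (x, y) inside a ground-truth box, FCOS regresses the distances (l, t, r, b) to the four box sides, and the center-ness target is sqrt(min(l, r) / max(l, r) × min(t, b) / max(t, b)). The sketch below uses hypothetical helper names (`ltrb_targets`, `centerness`) and assumes the location lies strictly inside the box.

```python
import math


def ltrb_targets(x, y, box):
    """Distances from location (x, y) to the left, top, right, bottom box sides.

    box: (x0, y0, x1, y1) with x0 < x < x1 and y0 < y < y1 (assumed).
    """
    x0, y0, x1, y1 = box
    return x - x0, y - y0, x1 - x, y1 - y


def centerness(l, t, r, b):
    """Center-ness target: 1.0 at the box center, approaching 0 near the sides."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


box = (0, 0, 100, 100)
print(centerness(*ltrb_targets(50, 50, box)))  # location at the box center
print(centerness(*ltrb_targets(10, 50, box)))  # location close to the left side
```

At inference the predicted center-ness is multiplied with the classification score, so boxes predicted from locations far from the object center rank lower before NMS.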

5. YOLOX beat YOLOv5

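Considering that YOLOv4 and YOLOv5 may be somewhat over-optimized for the anchor-based pipeline, the YOLOX authors chose YOLOv3 as their starting point. They turned YOLOv3 into YOLOX using the following techniques: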

  • Decoupled head
  • Anchor-free architecture (like FCOS)
  • Advanced label assignment strategy (SimOTA)

They also used some other training strategies, such as EMA weight updates, a cosine LR schedule, IoU loss, and an IoU-aware branch.
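Two of these training strategies are easy to show in isolation. The sketch below is a generic version of a cosine learning-rate schedule and an exponential moving average (EMA) of weights, not the YOLOX implementation; the function names, the flat-list weight representation, and the default `decay` are assumptions for illustration.

```python
import math


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine schedule: base_lr at step 0, decaying smoothly to min_lr."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step over weights stored as flat lists of floats (assumed layout)."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Toy loop: the "training" weights jump around, the EMA copy moves slowly.
ema = [0.0, 0.0]
for step in range(5):
    lr = cosine_lr(step, total_steps=5, base_lr=0.01)
    weights = [1.0, -1.0]  # stand-in for weights after an optimizer step
    ema = ema_update(ema, weights, decay=0.9)
    print(step, lr, ema)
```

The EMA copy is typically the one used for evaluation, since it averages out the noise of the most recent updates.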

6. VFNet

VariFocalNet: An IoU-aware Dense Object Detector
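Accurately ranking candidate detections is crucial for a dense object detector to achieve high performance. Prior work ranks candidates either by the classification score or by a combination of the classification score and a predicted localization score (center-ness); both options are still sub-optimal. VFNet proposes the IoU-Aware Classification Score (IACS), which uses IoU to form a joint representation of object presence confidence and localization accuracy.

VFNet introduces the VariFocal Loss. VariFocal Loss down-weights only negative examples to address the class imbalance during training, and it up-weights high-quality positive examples, which are the ones that generate the prime detections. VFNet is based on FCOS+ATSS with the center-ness branch removed. It has three new components:

  • The VariFocal Loss
  • The star-shaped bounding box feature representation
  • The bounding box refinement

VFNet also uses GIoU Loss for both bounding box branches. VariFocal Loss consistently improved RetinaNet, FoveaBox and ATSS by 0.9 AP, and by 1.4 AP for RepPoints.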
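The asymmetry between positives and negatives is easier to see in code. In the sketch below, `p` is the predicted IACS and `q` is the target: the IoU between the predicted box and the ground truth for a positive location, and 0 for a negative one. Positives get a binary cross-entropy term weighted by `q`; negatives are down-weighted by `alpha * p**gamma`. The function name, the scalar interface, and the default `alpha` and `gamma` values are assumptions for this example.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-7):
    """VariFocal-style loss for one location.

    p: predicted IoU-aware classification score in [0, 1]
    q: target score (IoU with the ground truth for positives, 0 for negatives)
    """
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    if q > 0:
        # Positive: cross-entropy against the soft target q, weighted by q,
        # so high-IoU positives contribute more.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negative: focal-style down-weighting by p^gamma.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


print(varifocal_loss(0.5, 0.9))  # high-quality positive
print(varifocal_loss(0.5, 0.1))  # low-quality positive
print(varifocal_loss(0.5, 0.0))  # negative
```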

7. YOLO-ReT: real-time model for edge devices

YOLO-ReT achieves 68.75 mAP on Pascal VOC and 34.91 mAP on COCO using a MobileNetV2×0.75 backbone.
Both model accuracy and execution time (frames per second) are crucial when deploying a model on an edge device. YOLO-ReT is based on these two ideas:

  • Backbone Truncation: only 60% of the backbone is initialised with pretrained weights. Using all the weights harms model accuracy.
  • Raw Feature Collection and Redistribution (RFCR):
    • Fuse {C2, C3, C4} into the C5 layer (fused feature map).
    • Discard the last CNN layers.
    • Pass the fused feature map through a 5x5 Mobile Convolution block (MBConv).
    • At each scale, concatenate (shortcut) the current feature map with the fused feature map from the previous step.

Both YOLO-ReT tricks (Backbone Truncation and RFCR) could be used in other models too.

8、FCOS3D: winner of a 3D Detection Challenge

FCOS3D won 1st place among all vision-only methods in the nuScenes 3D Detection Challenge of NeurIPS 2020. Here is a brief description:

  • FCOS3D is a monocular 3D object detector.
  • It is an anchor-free model based on its 2D counterpart, FCOS.
  • It replaces the FCOS regression branch with 6 branches.
  • The center-ness is redefined with a 2D Gaussian distribution based on the 3D center.
  • The authors showed some failure cases, mainly involving large objects and occluded objects.
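One way to read the Gaussian center-ness is as a score that equals 1 at the projected 3D center and decays with the squared 2D offset from it. The sketch below shows that shape only; the function name, the use of already-normalized offsets `dx` and `dy`, and the value of `alpha` are assumptions for illustration, not details confirmed by the text above.

```python
import math


def gaussian_centerness(dx, dy, alpha=2.5):
    """Gaussian-shaped center-ness target.

    dx, dy: offsets from a feature-map location to the projected 3D center
            (assumed to be normalized already).
    alpha:  controls how fast the score decays (illustrative value).
    """
    return math.exp(-alpha * (dx * dx + dy * dy))


print(gaussian_centerness(0.0, 0.0))  # at the projected center
print(gaussian_centerness(0.5, 0.0))  # half a unit away
print(gaussian_centerness(1.0, 1.0))  # far from the center
```

Unlike the 2D FCOS center-ness, this form does not need the four distances to the box sides; it depends only on the distance to a single reference point.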

Summary

This article is mainly based on content from the ai_fast_track account on Twitter.