List of papers about Object Detection.
(box AP)
Detector | Backbone | VOC07 | VOC12 | COCO | Speed | Publish |
---|---|---|---|---|---|---|
R-CNN | AlexNet | 58.5 | 53.3 | CVPR14-Rich feature hierarchies for accurate object detection and semantic segmentation | ||
R-CNN | VGG16 | 66 | CVPR14 | |||
SPP-Net | ZF-5 | 54.2 | ECCV14-Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition | |||
DeepID-Net | 64.1 | |||||
NoC | 73.3 | 68.8 | ||||
DCN-BOSP | 68.5 | 66.4 | CVPR15-Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction | |||
DeepBox | 37.8 | ICCV15-DeepBox: Learning Objectness with Convolutional Networks | ||||
AttentionNet | AttentionNet + Refine + R-CNN | 69.8 | 72 | ICCV15-AttentionNet: Aggregating Weak Directions for Accurate Object Detection | ||
DeepProposal | ICCV15-DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers | |||||
MR-CNN | 78.2 | 73.9 | ICCV15-Object detection via a multi-region & semantic segmentation-aware CNN model | |||
Fast R-CNN | VGG16 | 70 | 68.4 | 19.7 | ICCV15 | |
Faster R-CNN | VGG16 | 73.2/78.8 | 70.4/75.9 | 21.9 | 198ms | NeurIPS15-Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks |
YOLO | VGG16? | 66.4 | 57.9 | CVPR16-You Only Look Once: Unified, Real-Time Object Detection | ||
G-CNN | VGG16 | 66.8 | 66.4 | CVPR16-G-CNN: an Iterative Grid Based Object Detector | ||
AZNet | VGG16 | 70.4 | 22.3 | CVPR16-Adaptive Object Detection Using Adjacency and Zoom Prediction | ||
ION | VGG16 | 80.1 | 77.9 | 33.1 | CVPR16-Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks | |
CRAFT | 75.7 | 71.3 | CVPR16-CRAFT Objects from Images | |||
HyperNet | 76.3 | 71.4 | CVPR16-HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection | |||
OHEM | 78.9 | 76.3 | 22.4 | CVPR16-Training Region-based Object Detectors with Online Hard Example Mining | ||
CRAFT | CVPR16-CRAFT Objects from Images | |||||
MPN | 33.2 | BMVC16-A MultiPath Network for Object Detection | ||||
SSD | 76.8 | 74.9 | 31.2 | ECCV16-SSD: Single Shot MultiBox Detector | ||
GBDNet | 77.2 | 27 | ECCV16-Crafting GBD-Net for Object Detection | |||
CPF | 76.4 | 72.6 | ECCV16-Contextual Priming and Feedback for Faster R-CNN | |||
MS-CNN | ECCV16-A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection | |||||
R-FCN | ResNet101 | 79.5 | 77.6 | 29.9 | NeurIPS16-R-FCN: Object Detection via Region-based Fully Convolutional Networks | |
PVANet 9.0 | 84.9 | 84.2 | NeurIPSW16-PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection | |||
DeepID-Net | 69 | TPAMI16 | ||||
NoC | 71.6 | 68.8 | 27.2 | TPAMI16-Object Detection Networks on Convolutional Feature Maps | ||
DSSD | 81.5 | 80.0 | 33.2 | arxiv17-DSSD: Deconvolutional Single Shot Detector | ||
TDM | 37.3 | CVPR17-Beyond Skip Connections: Top-Down Modulation for Object Detection | ||||
FPN | 36.2 | CVPR17-Feature Pyramid Networks for Object Detection | ||||
YOLOv2 | DarkNet-19 | 78.6 | 73.4 | CVPR17-YOLO9000: Better, Faster, Stronger | ||
RON | 77.6 | 75.4 | 27.4 | CVPR17-RON: Reverse Connection with Objectness Prior Networks for Object Detection | ||
RSA | ICCV17-Recurrent Scale Approximation for Object Detection in CNN | |||||
DeNet | 77.1 | 73.9 | 33.8 | ICCV17-DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling | ||
CoupleNet | 82.7 | 80.4 | 34.4 | ICCV17-CoupleNet: Coupling Global Structure with Local Parts for Object Detection | ||
RetinaNet | ResNeXt101+FPN ? | 39.1 | ICCV17-Focal Loss for Dense Object Detection | |||
Mask R-CNN | ResNeXt101 | 39.8 | ICCV17 | |||
Mask R-CNN | ResNet101-FPN | 38.2 | ICCV17 | |||
DSOD | 77.7 | 76.3 | ICCV17-DSOD: Learning Deeply Supervised Object Detectors from Scratch | |||
SMN | 70.0 | ICCV17-Spatial Memory for Context Reasoning in Object Detection | ||||
DCN | Aligned-Inception-ResNet | 81.5 | 37.5 | ICCV17-Deformable Convolutional Networks | ||
Light-Head R-CNN | Xception* | 41.5? | arxiv17-Light-Head R-CNN: In Defense of Two-Stage Object Detector | |||
YOLOv3 | DarkNet53 | 33 | arxiv18-YOLOv3: An Incremental Improvement | |||
SIN | VGG16? | 76 | 73.1 | 23.2 | CVPR18-Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | |
STDN | DenseNet-169 | 80.9 | 31.8 | CVPR18-Scale-Transferrable Object Detection | ||
RefineDet | ResNet101 | 83.8 | 83.5 | 41.8 | CVPR18-Single-Shot Refinement Neural Network for Object Detection | |
MegDet | CVPR18-MegDet: A Large Mini-Batch Object Detector | |||||
DA Faster R-CNN | CVPR18-Domain Adaptive Faster R-CNN for Object Detection in the Wild | |||||
SNIP | 45.7 | CVPR18-An Analysis of Scale Invariance in Object Detection – SNIP | ||||
Relation-Network | 32.5 | CVPR18-Relation Networks for Object Detection | ||||
Cascade R-CNN | ResNet-101 | 42.8 | CVPR18-Cascade R-CNN: Delving into High Quality Object Detection | |||
MLKP | 80.6 | 77.2 | 28.6 | CVPR18-Multi-scale Location-aware Kernel Representation for Object Detection | ||
Fitness-NMS | 41.8 | CVPR18-Improving Object Localization with Fitness NMS and Bounded IoU Loss | ||||
PANet | ResNet50-FPN | 41.2 | CVPR18 | |||
PANet | ResNeXt101 | 47.4 | CVPR18 | |||
STDNet | BMVC18-STDnet: A ConvNet for Small Target Detection | |||||
RFBNet | 82.2 | ECCV18-Receptive Field Block Net for Accurate and Fast Object Detection | ||||
CornerNet | Hourglass104 | 42.1 | ECCV18-CornerNet: Detecting Objects as Paired Keypoints | |||
PFPNet | 84.1 | 83.7 | 39.4 | ECCV18-Parallel Feature Pyramid Network for Object Detection | ||
Softer-NMS | arxiv18-Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection | |||||
ShapeShifter | ECML-PKDD18-ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector | |||||
Pelee | 70.9 | NeurIPS18-Pelee: A Real-Time Object Detection System on Mobile Devices | ||||
HKRM | 78.8 | 37.8 | NeurIPS18-Hybrid Knowledge Routed Modules for Large-scale Object Detection | |||
SNIPER | NeurIPS18-SNIPER: Efficient Multi-Scale Training | |||||
FishNet | 43.3 | NeurIPS18 | ||||
M2Det | 44.2 | AAAI19-M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network | ||||
R-DAD | Res152 | 81.2 | 82 | 43.1 | AAAI19-Object Detection based on Region Decomposition and Assembly | |
Feature Intertwiner | 42.5/44.2 | ICLR19-Feature Intertwiner for Object Detection | ||||
Locating Objects Without Bounding Boxes | CVPR19 | |||||
GIoU | CVPR19 | |||||
Cascade Mask R-CNN | ResNeXt-101-FPN | 46.6 | ||||
HTC(Hybrid Task Cascade) | ResNeXt-101-FPN | 47.1 | CVPR19 | |||
Guided Anchoring(GA-RPN) | CVPR19 | |||||
Libra R-CNN | ResNeXt-101-FPN | 43 | CVPR19-Libra R-CNN: Balanced Learning for Object Detection | |||
SNIPER | 47.6 | arxiv19.May-SNIPER: Efficient Multi-Scale Training | ||||
Cascade RetinaNet | ResNet101 | 41.1 | BMVC19-Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection | |||
GFR | 30 | BMVC19-Improving Object Detection from Scratch via Gated Feature Reuse | ||||
TridentNet | ResNet101-DCN | 48.4 | ICCV19-Scale-Aware Trident Networks for Object Detection | |||
HTC+DCN (extra training) | ResNeXt-101-FPN | 50.7 | CVPR19 | |||
NAS-FPN | 48.3 | CVPR19-Learning Scalable Feature Pyramid Architecture for Object Detection | ||||
CornerNet-Saccade+gt attention | 50.3 | arxiv19.Apr | ||||
Cascade R-CNN | ResNeXt-152 | 50.9 | arxiv19.Jun | |||
Learning Data Augmentation Strategies for Object Detection | 50.7 | arxiv19.Jun | ||||
ThunderNet | SNet535 | 28 | ICCV19-ThunderNet: Towards Real-time Generic Object Detection | |||
CARAFE | MaskRCNN-Res50 | 38.8 | ICCV19-CARAFE: Content-Aware ReAssembly of FEatures | |||
LIP | FRCN-res101 | 43.9 | ICCV19-LIP: Local Importance-based Pooling | |||
FreeAnchor | ResNeXt101 | 44.8 | NeurIPS19-FreeAnchor: Learning to Match Anchors for Visual Object Detection | |||
Cascade RPN | 41.6 | NeurIPS19 | ||||
Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection | better | NeurIPS19 | ||||
DetNAS | better | NeurIPS19-DetNAS: Backbone Search for Object Detection | ||||
CBNet | Cascade Mask R-CNN + Dual-ResNeXt152 | 52.8 | arxiv19-CBNet: A Novel Composite Backbone Network Architecture for Object Detection | |||
CBNet | Cascade Mask R-CNN + Triple-ResNeXt152 | 53.3 | arxiv19-CBNet: A Novel Composite Backbone Network Architecture for Object Detection | |||
(mask AP)
Detector | Backbone | VOC07 | VOC12 | COCO | Speed | Publish |
---|---|---|---|---|---|---|
Mask R-CNN | ResNet-50-FPN | 35.6 | ICCV17 | |||
PANet | ResNet-50-FPN | 36.6 | ||||
Cascade Mask R-CNN | ResNeXt-101-FPN | 40.1 | ||||
HTC(Hybrid Task Cascade) | ResNeXt-101-FPN | 41.2 | CVPR19 | |||
HTC(Hybrid Task Cascade) | SENet-154 + ResNeXt-101 64x4d & 32x8d + DPN-107 + FishNet | 49 | CVPR19 | |||
MS R-CNN(Mask Scoring) | ResNet-101 DCN+FPN | 39.6(COCO2017) | CVPR19 |
Detector | Backbone | VOC07 | VOC12 | COCO | Speed | Publish |
---|---|---|---|---|---|---|
OverFeat | ICLR14-OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks | |||||
MultiBox | CVPR14-Scalable Object Detection using Deep Neural Networks | |||||
DenseBox | CVPR15 | |||||
MultiGrasp | ICRA15 | |||||
UnitBox | ACM_MM16 | |||||
YOLOv1 | VGG16? | 66.4 | 57.9 | CVPR16 | ||
YOLOv2 | DarkNet-19 | 21.6 | CVPR17 | |||
YOLOv3 | DarkNet53 | 33 | arxiv18 | |||
CornerNet | Hourglass104 | 40.5/42.1 | ECCV18 | |||
Learning Region Features for Object Detection | ResNet101-FPN | 39.9 | ECCV18 | |||
Grid R-CNN | ResNeXt-101 | 43.2 | CVPR19 | |||
MetaAnchor | ResNet101-FPN | 83.3 | 37.9 | NeurIPS18-MetaAnchor: Learning to Detect Objects with Customized Anchors | ||
DeRPN | AAAI19 | |||||
GA-RPN | ResNet-50-FPN | 39.6 | CVPR19 | |||
FSAF | ResNeXt101 | 42.9/44.6 | CVPR19 | |||
ExtremeNet | Hourglass104 | 40.2/43.7 | CVPR19-Bottom-up Object Detection by Grouping Extreme and Center Points | |||
DuBox | VGG-16 | 82.89 | 82.01 | 39.52 | arxiv19.Apr | |
CenterNet: Keypoint Triplets for Object Detection | Hourglass-104 | 44.9/47 | arxiv19.Apr | |||
FCOS | ResNeXt-32x8d-101-FPN | 42.1 | ICCV19-arxiv19.Apr | |||
FoveaBox | ResNeXt-101 | 42.1 | arxiv19.Apr | |||
Objects as Points (CenterNet) | Hourglass104 | 42.1/45.1 | arxiv19.Apr | |||
RPDet(RepPoints) | ResNet-101-DCN | 42.8 | ICCV19 | |||
CornerNet-Lite | Hourglass-54 | 43.2 | arxiv19.Apr | |||
LW-RetinaNet | 35.4 | arxiv19.May | ||||
Matrix Nets | ResNeXt-101-X | 47.8 | arxiv19.Aug | |||
FreeAnchor | ResNeXt101 | 44.8 | NeurIPS19-FreeAnchor: Learning to Match Anchors for Visual Object Detection | |||
Dataset | Article | Publish | |
---|---|---|---|
KITTI | Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite (Andreas Geiger, Philip Lenz, Raquel Urtasun) | CVPR12 | |
PASCAL VOC | The PASCAL Visual Object Classes (VOC) Challenge | IJCV10 | |
PASCAL VOC | The PASCAL Visual Object Classes Challenge: A Retrospective | IJCV15 | |
ImageNet | ImageNet: A Large-Scale Hierarchical Image Database | CVPR09 | |
ImageNet | ImageNet Large Scale Visual Recognition Challenge | IJCV15 | |
COCO | Microsoft COCO: Common Objects in Context | ECCV14 | |
Open Images | The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale | arXiv18 | |
Objects365 | |||
LVIS | LVIS: A Dataset for Large Vocabulary Instance Segmentation | CVPR19 |
[Updated 2019.9.20; work in progress]