目标检测最新方法介绍

https://handong1587.github.io/deep_learning/2015/10/09/nlp.html

Jump to...
  1. Leaderboard
  2. Papers
    1. R-CNN
    2. MultiBox
    3. SPP-Net
    4. DeepID-Net
    5. NoC
    6. Fast R-CNN
    7. DeepBox
    8. MR-CNN
    9. Faster R-CNN
    10. YOLO
    11. AttentionNet
    12. DenseBox
    13. SSD
    14. Inside-Outside Net (ION)
    15. G-CNN
    16. HyperNet
    17. MultiPathNet
    18. CRAFT
    19. OHEM
    20. R-FCN
    21. MS-CNN
    22. PVANET
    23. GBD-Net
    24. StuffNet
    25. Feature Pyramid Network (FPN)
  3. Detection From Video
    1. T-CNN
    2. Datasets
  4. Object Detection in 3D
  5. Salient Object Detection
  6. Specific Object Deteciton
    1. Face Deteciton
      1. UnitBox
      2. MTCNN
      3. Datasets / Benchmarks
    2. Facial Point / Landmark Detection
    3. People Detection
    4. Person Head Detection
    5. Pedestrian Detection
    6. Vehicle Detection
    7. Traffic-Sign Detection
    8. Boundary / Edge / Contour Detection
    9. Skeleton Detection
    10. Fruit Detection
    11. Others
  7. Object Proposal
  8. Localization
  9. Tutorials
  10. Projects
  11. Blogs
Method VOC2007 VOC2010 VOC2012 ILSVRC 2013 MSCOCO 2015 Speed
OverFeat       24.3%    
R-CNN (AlexNet) 58.5% 53.7% 53.3% 31.4%    
R-CNN (VGG16) 66.0%          
SPP_net(ZF-5) 54.2%(1-model), 60.9%(2-model)     31.84%(1-model), 35.11%(6-model)    
DeepID-Net 64.1%     50.3%    
NoC 73.3%   68.8%      
Fast-RCNN (VGG16) 70.0% 68.8% 68.4%   19.7%(@[0.5-0.95]), 35.9%(@0.5)  
MR-CNN 78.2%   73.9%      
Faster-RCNN (VGG16) 78.8%   75.9%   21.9%(@[0.5-0.95]), 42.7%(@0.5) 198ms
Faster-RCNN (ResNet-101) 85.6%   83.8%   37.4%(@[0.5-0.95]), 59.0%(@0.5)  
SSD300 (VGG16) 72.1%         58 fps
SSD500 (VGG16) 75.1%         23 fps
ION 79.2%   76.4%      
AZ-Net 70.4%       22.3%(@[0.5-0.95]), 41.0%(@0.5)  
CRAFT 75.7%   71.3% 48.5%    
OHEM 78.9%   76.3%   25.5%(@[0.5-0.95]), 45.9%(@0.5)  
R-FCN (ResNet-50) 77.4%         0.12sec(K40), 0.09sec(TitianX)
R-FCN (ResNet-101) 79.5%         0.17sec(K40), 0.12sec(TitianX)
R-FCN (ResNet-101),multi sc train 83.6%   82.0%   31.5%(@[0.5-0.95]), 53.2%(@0.5)  
PVANet 9.0 81.8%   82.5%     750ms(CPU), 46ms(TitianX)

Leaderboard

Detection Results: VOC2012

  • intro: Competition “comp4” (train on own data)
  • homepage: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

Papers

Deep Neural Networks for Object Detection

  • paper: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

  • intro: A deep version of the sliding window method, predicts bounding box directly from each location of the topmost feature map after knowing the confidences of the underlying object categories.
  • intro: training a convolutional network to simultaneously classify, locate and detect objects in images can boost the classification accuracy and the detection and localization accuracy of all tasks
  • arxiv: http://arxiv.org/abs/1312.6229
  • github: https://github.com/sermanet/OverFeat
  • code: http://cilvr.nyu.edu/doku.php?id=software:overfeat:start

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

  • intro: R-CNN
  • arxiv: http://arxiv.org/abs/1311.2524
  • supp: http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf
  • slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf
  • slides: http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf
  • github: https://github.com/rbgirshick/rcnn
  • notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/
  • caffe-pr(“Make R-CNN the Caffe detection example”):https://github.com/BVLC/caffe/pull/482

MultiBox

Scalable Object Detection using Deep Neural Networks

  • intro: MultiBox. Train a CNN to predict Region of Interest.
  • arxiv: http://arxiv.org/abs/1312.2249
  • github: https://github.com/google/multibox
  • blog: https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html

Scalable, High-Quality Object Detection

  • intro: MultiBox
  • arxiv: http://arxiv.org/abs/1412.1441
  • github: https://github.com/google/multibox

SPP-Net

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

  • intro: ECCV 2014 / TPAMI 2015
  • arxiv: http://arxiv.org/abs/1406.4729
  • github: https://github.com/ShaoqingRen/SPP_net
  • notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

  • arxiv: http://arxiv.org/abs/1407.5736

DeepID-Net

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

  • intro: PAMI 2016
  • intro: an extension of R-CNN. box pre-training, cascade on region proposals, deformation layers and context representations
  • project page:http://www.ee.cuhk.edu.hk/%CB%9Cwlouyang/projects/imagenetDeepId/index.html
  • arxiv: http://arxiv.org/abs/1412.5661

Object Detectors Emerge in Deep Scene CNNs

  • arxiv: http://arxiv.org/abs/1412.6856
  • paper: https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf
  • paper: https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf
  • slides: http://places.csail.mit.edu/slide_iclr2015.pdf

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

  • intro: CVPR 2015
  • project(code+data): https://www.cs.toronto.edu/~yukun/segdeepm.html
  • arxiv: https://arxiv.org/abs/1502.04275
  • github: https://github.com/YknZhu/segDeepM

NoC

Object Detection Networks on Convolutional Feature Maps

  • intro: TPAMI 2015
  • arxiv: http://arxiv.org/abs/1504.06066

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

  • arxiv: http://arxiv.org/abs/1504.03293
  • slides: http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf
  • github: https://github.com/YutingZhang/fgs-obj

Fast R-CNN

Fast R-CNN

  • arxiv: http://arxiv.org/abs/1504.08083
  • slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
  • github: https://github.com/rbgirshick/fast-rcnn
  • webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29
  • notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/
  • notes: http://blog.csdn.net/linj_m/article/details/48930179
  • github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn
  • github: https://github.com/mahyarnajibi/fast-rcnn-torch
  • github: https://github.com/apple2373/chainer-simple-fast-rnn
  • github(Tensorflow): https://github.com/zplizzi/tensorflow-fast-rcnn

DeepBox

DeepBox: Learning Objectness with Convolutional Networks

  • arxiv: http://arxiv.org/abs/1505.02146
  • github: https://github.com/weichengkuo/DeepBox

MR-CNN

Object detection via a multi-region & semantic segmentation-aware CNN model

  • intro: ICCV 2015. MR-CNN
  • arxiv: http://arxiv.org/abs/1505.01749
  • github: https://github.com/gidariss/mrcnn-object-detection
  • notes: http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/
  • notes: http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/
  • my notes: Who can tell me why there are a bunch of duplicated sentences in section 7.2 “Detection error analysis”? :-D

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

  • intro: NIPS 2015
  • arxiv: http://arxiv.org/abs/1506.01497
  • gitxiv: http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region
  • slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf
  • github: https://github.com/ShaoqingRen/faster_rcnn
  • github: https://github.com/rbgirshick/py-faster-rcnn
  • github: https://github.com/mitmul/chainer-faster-rcnn
  • github(Torch): https://github.com/andreaskoepf/faster-rcnn.torch
  • github(Torch): https://github.com/ruotianluo/Faster-RCNN-Densecap-torch
  • github(Tensorflow): https://github.com/smallcorgi/Faster-RCNN_TF
  • github(tensorflow): https://github.com/CharlesShang/TFFRCNN

Faster R-CNN in MXNet with distributed implementation and data parallelization

  • github: https://github.com/dmlc/mxnet/tree/master/example/rcnn

YOLO

You Only Look Once: Unified, Real-Time Object Detection

  • intro: YOLO uses the whole topmost feature map to predict both confidences for multiple categories and bounding boxes (which are shared for these categories).
  • arxiv: http://arxiv.org/abs/1506.02640
  • code: http://pjreddie.com/darknet/yolo/
  • github: https://github.com/pjreddie/darknet
  • reddit:https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/
  • github: https://github.com/gliese581gg/YOLO_tensorflow
  • github: https://github.com/xingwangsfu/caffe-yolo
  • github: https://github.com/frankzhangrui/Darknet-Yolo
  • github: https://github.com/BriSkyHekun/py-darknet-yolo
  • github: https://github.com/tommy-qichang/yolo.torch
  • github: https://github.com/frischzenger/yolo-windows
  • gtihub: https://github.com/AlexeyAB/yolo-windows

Start Training YOLO with Our Own Data

  • intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
  • blog: http://guanghan.info/blog/en/my-works/train-yolo/
  • github: https://github.com/Guanghan/darknet

R-CNN minus R

  • arxiv: http://arxiv.org/abs/1506.06981

AttentionNet

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

  • intro: ICCV 2015
  • intro: state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 human detection task
  • arxiv: http://arxiv.org/abs/1506.07704
  • slides: https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf
  • slides: http://image-net.org/challenges/talks/lunit-kaist-slide.pdf

DenseBox

DenseBox: Unifying Landmark Localization with End to End Object Detection

  • arxiv: http://arxiv.org/abs/1509.04874
  • demo: http://pan.baidu.com/s/1mgoWWsS
  • KITTI result: http://www.cvlibs.net/datasets/kitti/eval_object.php

SSD

SSD: Single Shot MultiBox Detector

  • arxiv: http://arxiv.org/abs/1512.02325
  • paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf
  • github: https://github.com/weiliu89/caffe/tree/ssd
  • video: http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973
  • github(MXNet): https://github.com/zhreshold/mxnet-ssd
  • github: https://github.com/zhreshold/mxnet-ssd.cpp
  • github(Keras): https://github.com/rykov8/ssd_keras

为什么SSD(Single Shot MultiBox Detector)对小目标的检测效果不好?

  • zhihu: https://www.zhihu.com/question/49455386

Inside-Outside Net (ION)

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

  • intro: “0.8s per image on a Titan X GPU (excluding proposal generation) without two-stage bounding-box regression and 1.15s per image with it”.
  • arxiv: http://arxiv.org/abs/1512.04143
  • slides: http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf
  • coco-leaderboard: http://mscoco.org/dataset/#detections-leaderboard

Adaptive Object Detection Using Adjacency and Zoom Prediction

  • intro: CVPR 2016. AZ-Net
  • arxiv: http://arxiv.org/abs/1512.07711
  • github: https://github.com/luyongxi/az-net
  • youtube: https://www.youtube.com/watch?v=YmFtuNwxaNM

G-CNN

G-CNN: an Iterative Grid Based Object Detector

  • arxiv: http://arxiv.org/abs/1512.07729

Factors in Finetuning Deep Model for object detection Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

  • intro: CVPR 2016.rank 3rd for provided data and 2nd for external data on ILSVRC 2015 object detection
  • project page:http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html
  • arxiv: http://arxiv.org/abs/1601.05150

We don’t need no bounding-boxes: Training object class detectors using only human verification

  • arxiv: http://arxiv.org/abs/1602.08405

HyperNet

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

  • arxiv: http://arxiv.org/abs/1604.00600

MultiPathNet

A MultiPath Network for Object Detection

  • intro: BMVC 2016. Facebook AI Research (FAIR)
  • arxiv: http://arxiv.org/abs/1604.02135
  • github: https://github.com/facebookresearch/multipathnet

CRAFT

CRAFT Objects from Images

  • intro: CVPR 2016. Cascade Region-proposal-network And FasT-rcnn. an extension of Faster R-CNN
  • project page: http://byangderek.github.io/projects/craft.html
  • arxiv: https://arxiv.org/abs/1604.03239
  • paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf
  • github: https://github.com/byangderek/CRAFT

OHEM

Training Region-based Object Detectors with Online Hard Example Mining

  • intro: CVPR 2016 Oral. Online hard example mining (OHEM)
  • arxiv: http://arxiv.org/abs/1604.03540
  • paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

  • intro: CVPR 2016
  • arxiv: http://arxiv.org/abs/1604.05766

Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers

http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

  • arxiv: http://arxiv.org/abs/1605.06409
  • github: https://github.com/daijifeng001/R-FCN
  • github: https://github.com/Orpine/py-R-FCN

Weakly supervised object detection using pseudo-strong labels

  • arxiv: http://arxiv.org/abs/1607.04731

Recycle deep features for better object detection

  • arxiv: http://arxiv.org/abs/1607.05066

MS-CNN

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

  • intro: ECCV 2016
  • intro: 640×480: 15 fps, 960×720: 8 fps
  • arxiv: http://arxiv.org/abs/1607.07155
  • github: https://github.com/zhaoweicai/mscnn
  • poster: http://www.eccv2016.org/files/posters/P-2B-38.pdf

Multi-stage Object Detection with Group Recursive Learning

  • intro: VOC2007: 78.6%, VOC2012: 74.9%
  • arxiv: http://arxiv.org/abs/1608.05159

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

  • intro: SubCNN
  • arxiv: http://arxiv.org/abs/1604.04693
  • github: https://github.com/yuxng/SubCNN

PVANET

PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

  • intro: “less channels with more layers”, concatenated ReLU, Inception, and HyperNet, batch normalization, residual connections
  • arxiv: http://arxiv.org/abs/1608.08021
  • github: https://github.com/sanghoon/pva-faster-rcnn
  • leaderboard(PVANet 9.0): http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

  • intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021
  • arxiv: https://arxiv.org/abs/1611.08588

GBD-Net

Gated Bi-directional CNN for Object Detection

  • intro: The Chinese University of Hong Kong & Sensetime Group Limited
  • paper: http://link.springer.com/chapter/10.1007/978-3-319-46478-7_22
  • mirror: https://pan.baidu.com/s/1dFohO7v

Crafting GBD-Net for Object Detection

  • intro: winner of the ImageNet object detection challenge of 2016. CUImage and CUVideo
  • intro: gated bi-directional CNN (GBD-Net)
  • arxiv: https://arxiv.org/abs/1610.02579
  • github: https://github.com/craftGBD/craftGBD

StuffNet

StuffNet: Using ‘Stuff’ to Improve Object Detection

  • arxiv: https://arxiv.org/abs/1610.05861

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

  • arxiv: https://arxiv.org/abs/1610.09609

Hierarchical Object Detection with Deep Reinforcement Learning

  • intro: Deep Reinforcement Learning Workshop (NIPS 2016)
  • project page: https://imatge-upc.github.io/detection-2016-nipsws/
  • arxiv: https://arxiv.org/abs/1611.03718
  • github: https://github.com/imatge-upc/detection-2016-nipsws

Learning to detect and localize many objects from few examples

  • arxiv: https://arxiv.org/abs/1611.05664

Speed/accuracy trade-offs for modern convolutional object detectors

  • intro: Google Research
  • arxiv: https://arxiv.org/abs/1611.10012

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

  • arxiv: https://arxiv.org/abs/1612.01051

Feature Pyramid Network (FPN)

Feature Pyramid Networks for Object Detection

  • intro: Facebook AI Research
  • arxiv: https://arxiv.org/abs/1612.03144

Detection From Video

Learning Object Class Detectors from Weakly Annotated Video

  • intro: CVPR 2012
  • paper:https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf

Analysing domain shift factors between videos and images for object detection

  • arxiv: https://arxiv.org/abs/1501.01186

Video Object Recognition

  • slides:http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video%20Object%20Recognition.pptx

Deep Learning for Saliency Prediction in Natural Video

  • intro: Submitted on 12 Jan 2016
  • keywords: Deep learning, saliency map, optical flow, convolution network, contrast features
  • paper: https://hal.archives-ouvertes.fr/hal-01251614/document

T-CNN

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

  • intro: Winning solution in ILSVRC2015 Object Detection from Video(VID) Task
  • arxiv: http://arxiv.org/abs/1604.02532
  • github: https://github.com/myfavouritekk/T-CNN

Object Detection from Video Tubelets with Convolutional Neural Networks

  • intro: CVPR 2016 Spotlight paper
  • arxiv: https://arxiv.org/abs/1604.04053
  • paper: http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf
  • gihtub: https://github.com/myfavouritekk/vdetlib

Object Detection in Videos with Tubelets and Multi-context Cues

  • intro: SenseTime Group
  • slides: http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf
  • slides: http://image-net.org/challenges/talks/Object%20Detection%20in%20Videos%20with%20Tubelets%20and%20Multi-context%20Cues%20-%20Final.pdf

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

  • intro: BMVC 2016
  • keywords: pseudo-labeler
  • arxiv: http://arxiv.org/abs/1607.04648
  • paper: http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf

CNN Based Object Detection in Large Video Images

  • intro: WangTao @ 爱奇艺
  • keywords: object retrieval, object detection, scene classification
  • slides: http://on-demand.gputechconf.com/gtc/2016/presentation/s6362-wang-tao-cnn-based-object-detection-large-video-images.pdf

Datasets

YouTube-Objects dataset v2.2

  • homepage: http://calvin.inf.ed.ac.uk/datasets/youtube-objects-dataset/

ILSVRC2015: Object detection from video (VID)

  • homepage: http://vision.cs.unc.edu/ilsvrc2015/download-videos-3j16.php#vid

Object Detection in 3D

Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks

  • arxiv: https://arxiv.org/abs/1609.06666

Salient Object Detection

This task involves predicting the salient regions of an image given by human eye fixations.

Best Deep Saliency Detection Models (CVPR 2016 & 2015)

http://i.cs.hku.hk/~yzyu/vision.html

Large-scale optimization of hierarchical features for saliency prediction in natural images

  • paper: http://coxlab.org/pdfs/cvpr2014_vig_saliency.pdf

Predicting Eye Fixations using Convolutional Neural Networks

  • paper: http://www.escience.cn/system/file?fileId=72648

Saliency Detection by Multi-Context Deep Learning

  • paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhao_Saliency_Detection_by_2015_CVPR_paper.pdf

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

  • arxiv: http://arxiv.org/abs/1510.05484

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

  • paper: www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html

Shallow and Deep Convolutional Networks for Saliency Prediction

  • arxiv: http://arxiv.org/abs/1603.00845
  • github: https://github.com/imatge-upc/saliency-2016-cvpr

Recurrent Attentional Networks for Saliency Detection

  • intro: CVPR 2016. recurrent attentional convolutional-deconvolution network (RACDNN)
  • arxiv: http://arxiv.org/abs/1604.03227

Two-Stream Convolutional Networks for Dynamic Saliency Prediction

  • arxiv: http://arxiv.org/abs/1607.04730

Unconstrained Salient Object Detection

Unconstrained Salient Object Detection via Proposal Subset Optimization

  • intro: CVPR 2016
  • project page: http://cs-people.bu.edu/jmzhang/sod.html
  • paper: http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf
  • github: https://github.com/jimmie33/SOD
  • caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection

DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection

  • paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DHSNet_Deep_Hierarchical_CVPR_2016_paper.pdf

Salient Object Subitizing

  • intro: CVPR 2015
  • intro: predicting the existence and the number of salient objects in an image using holistic cues
  • project page: http://cs-people.bu.edu/jmzhang/sos.html
  • arxiv: http://arxiv.org/abs/1607.07525
  • paper: http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf
  • caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing

Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection

  • intro: ACMMM 2016. deeply-supervised recurrent convolutional neural network (DSRCNN)
  • arxiv: http://arxiv.org/abs/1608.05177

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

  • intro: ECCV 2016
  • arxiv: http://arxiv.org/abs/1608.05186

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

  • arxiv: http://arxiv.org/abs/1608.08029

A Deep Multi-Level Network for Saliency Prediction

  • arxiv: http://arxiv.org/abs/1609.01064

Visual Saliency Detection Based on Multiscale Deep CNN Features

  • intro: IEEE Transactions on Image Processing
  • arxiv: http://arxiv.org/abs/1609.02077

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

  • intro: DSCLRCN
  • arxiv: https://arxiv.org/abs/1610.01708

Deeply supervised salient object detection with short connections

  • arxiv: https://arxiv.org/abs/1611.04849

Weakly Supervised Top-down Salient Object Detection

  • intro: Nanyang Technological University
  • arxiv: https://arxiv.org/abs/1611.05345

Specific Object Deteciton

Face Deteciton

Multi-view Face Detection Using Deep Convolutional Neural Networks

  • intro: Yahoo
  • arxiv: http://arxiv.org/abs/1502.02766

From Facial Parts Responses to Face Detection: A Deep Learning Approach

  • project page: http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html

Compact Convolutional Neural Network Cascade for Face Detection

  • arxiv: http://arxiv.org/abs/1508.01292
  • github: https://github.com/Bkmz21/FD-Evaluation

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

  • intro: ECCV 2016
  • arxiv: https://arxiv.org/abs/1606.00850
  • github(MXNet): https://github.com/tfwu/FaceDetection-ConvNet-3D

Supervised Transformer Network for Efficient Face Detection

  • arxiv: http://arxiv.org/abs/1607.05477

UnitBox

UnitBox: An Advanced Object Detection Network

  • intro: ACM MM 2016
  • arxiv: http://arxiv.org/abs/1608.01471

Bootstrapping Face Detection with Hard Negative Examples

  • author: 万韶华 @ 小米.
  • intro: Faster R-CNN, hard negative mining. state-of-the-art on the FDDB dataset
  • arxiv: http://arxiv.org/abs/1608.02236

Grid Loss: Detecting Occluded Faces

  • intro: ECCV 2016
  • arxiv: https://arxiv.org/abs/1609.00129
  • paper: http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf
  • poster: http://www.eccv2016.org/files/posters/P-2A-34.pdf

A Multi-Scale Cascade Fully Convolutional Network Face Detector

  • intro: ICPR 2016
  • arxiv: http://arxiv.org/abs/1609.03536

MTCNN

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

  • project page: https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html
  • arxiv: https://arxiv.org/abs/1604.02878
  • github(Matlab): https://github.com/kpzhang93/MTCNN_face_detection_alignment
  • github(MXNet): https://github.com/pangyupo/mxnet_mtcnn_face_detection
  • github: https://github.com/DaFuCoding/MTCNN_Caffe

Datasets / Benchmarks

FDDB: Face Detection Data Set and Benchmark

  • homepage: http://vis-www.cs.umass.edu/fddb/index.html
  • results: http://vis-www.cs.umass.edu/fddb/results.html

WIDER FACE: A Face Detection Benchmark

  • homepage: http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/
  • arxiv: http://arxiv.org/abs/1511.06523

Facial Point / Landmark Detection

Deep Convolutional Network Cascade for Facial Point Detection

  • homepage: http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm
  • paper: http://www.ee.cuhk.edu.hk/~xgwang/papers/sunWTcvpr13.pdf
  • github: https://github.com/luoyetx/deep-landmark

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

  • intro: ECCV 2016
  • arxiv: https://arxiv.org/abs/1608.05477

Detecting facial landmarks in the video based on a hybrid framework

  • arxiv: http://arxiv.org/abs/1609.06441

Deep Constrained Local Models for Facial Landmark Detection

  • arxiv: https://arxiv.org/abs/1611.08657

People Detection

End-to-end people detection in crowded scenes

  • arxiv: http://arxiv.org/abs/1506.04878
  • github: https://github.com/Russell91/reinspect
  • ipn:http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb

Detecting People in Artwork with CNNs

  • intro: ECCV 2016 Workshops
  • arxiv: https://arxiv.org/abs/1610.08871

Person Head Detection

Context-aware CNNs for person head detection

  • arxiv: http://arxiv.org/abs/1511.07917
  • github: https://github.com/aosokin/cnn_head_detection

Pedestrian Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

  • intro: CVPR 2015
  • project page: http://mmlab.ie.cuhk.edu.hk/projects/TA-CNN/
  • paper: http://arxiv.org/abs/1412.0069

Deep Learning Strong Parts for Pedestrian Detection

  • intro: ICCV 2015. CUHK. DeepParts
  • intro: Achieving 11.89% average miss rate on Caltech Pedestrian Dataset
  • paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf

Deep convolutional neural networks for pedestrian detection

  • arxiv: http://arxiv.org/abs/1510.03608
  • github: https://github.com/DenisTome/DeepPed

New algorithm improves speed and accuracy of pedestrian detection

  • blog: http://www.eurekalert.org/pub_releases/2016-02/uoc–nai020516.php

Pushing the Limits of Deep CNNs for Pedestrian Detection

  • intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
  • arxiv: http://arxiv.org/abs/1603.04525

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

  • arxiv: http://arxiv.org/abs/1607.04436

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

  • arxiv: http://arxiv.org/abs/1607.04441

Is Faster R-CNN Doing Well for Pedestrian Detection?

  • arxiv: http://arxiv.org/abs/1607.07032
  • github: https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian

Reduced Memory Region Based Deep Convolutional Neural Network Detection

  • intro: IEEE 2016 ICCE-Berlin
  • arxiv: http://arxiv.org/abs/1609.02500

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

  • arxiv: https://arxiv.org/abs/1610.03466

Multispectral Deep Neural Networks for Pedestrian Detection

  • intro: BMVC 2016 oral
  • arxiv: https://arxiv.org/abs/1611.02644

Vehicle Detection

DAVE: A Unified Framework for Fast Vehicle Detection and Annotation

  • intro: ECCV 2016
  • arxiv: http://arxiv.org/abs/1607.04564

Traffic-Sign Detection

Traffic-Sign Detection and Classification in the Wild

  • project page(code+dataset): http://cg.cs.tsinghua.edu.cn/traffic-sign/
  • paper: http://120.52.73.11/www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf
  • code & model: http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip

Boundary / Edge / Contour Detection

Holistically-Nested Edge Detection

  • intro: ICCV 2015, Marr Prize
  • paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf
  • arxiv: http://arxiv.org/abs/1504.06375
  • github: https://github.com/s9xie/hed

Unsupervised Learning of Edges

  • intro: CVPR 2016. Facebook AI Research
  • arxiv: http://arxiv.org/abs/1511.04166
  • zn-blog: http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html

Pushing the Boundaries of Boundary Detection using Deep Learning

  • arxiv: http://arxiv.org/abs/1511.07386

Convolutional Oriented Boundaries

  • intro: ECCV 2016
  • arxiv: http://arxiv.org/abs/1608.02755

Richer Convolutional Features for Edge Detection

  • intro: richer convolutional features (RCF)
  • arxiv: https://arxiv.org/abs/1612.02103

Skeleton Detection

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

  • arxiv: http://arxiv.org/abs/1603.09446
  • github: https://github.com/zeakey/DeepSkeleton

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

  • arxiv: http://arxiv.org/abs/1609.03659

Fruit Detection

Deep Fruit Detection in Orchards

  • arxiv: https://arxiv.org/abs/1610.03677

Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards

  • intro: The Journal of Field Robotics in May 2016
  • project page: http://confluence.acfr.usyd.edu.au/display/AGPub/
  • arxiv: https://arxiv.org/abs/1610.08120

Others

Deep Deformation Network for Object Landmark Localization

  • arxiv: http://arxiv.org/abs/1605.01014

Fashion Landmark Detection in the Wild

  • arxiv: http://arxiv.org/abs/1608.03049

Deep Learning for Fast and Accurate Fashion Item Detection

  • intro: Kuznech Inc.
  • intro: MultiBox and Fast R-CNN
  • paper:https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf

Visual Relationship Detection with Language Priors

  • intro: ECCV 2016 oral
  • paper: https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf
  • github: https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection

OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)

  • github: https://github.com/geometalab/OSMDeepOD

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

  • intro: IEEE SITIS 2016
  • arxiv: https://arxiv.org/abs/1611.04357

Associative Embedding:End-to-End Learning for Joint Detection and Grouping

  • arxiv: https://arxiv.org/abs/1611.05424

Deep Cuboid Detection: Beyond 2D Bounding Boxes

  • intro: CMU & Magic Leap
  • arxiv: https://arxiv.org/abs/1611.10010

Object Proposal

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

  • arxiv: http://arxiv.org/abs/1510.04445
  • github: https://github.com/aghodrati/deepproposal

Scale-aware Pixel-wise Object Proposal Networks

  • intro: IEEE Transactions on Image Processing
  • arxiv: http://arxiv.org/abs/1601.04798

Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization

  • intro: AttractioNet
  • arxiv: https://arxiv.org/abs/1606.04446
  • github: https://github.com/gidariss/AttractioNet

Learning to Segment Object Proposals via Recursive Neural Networks

  • arxiv: https://arxiv.org/abs/1612.01057

Localization

Beyond Bounding Boxes: Precise Localization of Objects in Images

  • intro: PhD Thesis
  • homepage: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html
  • phd-thesis: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf
  • github(“SDS using hypercolumns”): https://github.com/bharath272/sds

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

  • arxiv: http://arxiv.org/abs/1503.00949

Weakly Supervised Object Localization Using Size Estimates

  • arxiv: http://arxiv.org/abs/1608.04314

Localizing objects using referring expressions

  • intro: ECCV 2016
  • keywords: LSTM, multiple instance learning (MIL)
  • paper: http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf
  • github: https://github.com/varun-nagaraja/referring-expressions

LocNet: Improving Localization Accuracy for Object Detection

  • arxiv: http://arxiv.org/abs/1511.07763
  • github: https://github.com/gidariss/LocNet

Learning Deep Features for Discriminative Localization

  • homepage: http://cnnlocalization.csail.mit.edu/
  • arxiv: http://arxiv.org/abs/1512.04150
  • github(Tensorflow): https://github.com/jazzsaxmafia/Weakly_detector
  • github: https://github.com/metalbubble/CAM
  • github: https://github.com/tdeboissiere/VGG16CAM-keras

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

  • intro: ECCV 2016
  • project page: http://www.di.ens.fr/willow/research/contextlocnet/
  • arxiv: http://arxiv.org/abs/1609.04331
  • github: https://github.com/vadimkantorov/contextlocnet

Tutorials

Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection

  • slides: http://research.microsoft.com/en-us/um/people/kahe/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf

Projects

TensorBox: a simple framework for training neural networks to detect objects in images

  • intro: “The basic model implements the simple and robust GoogLeNet-OverFeat algorithm. We additionally provide an implementation of the ReInspect algorithm”
  • github: https://github.com/Russell91/TensorBox

Object detection in torch: Implementation of some object detection frameworks in torch

  • github: https://github.com/fmassa/object-detection.torch

Using DIGITS to train an Object Detection network

  • github: https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md

FCN-MultiBox Detector

  • intro: Full convolution MultiBox Detector ( like SSD) implemented in Torch.
  • github: https://github.com/teaonly/FMD.torch

Blogs

Convolutional Neural Networks for Object Detection

http://rnd.azoft.com/convolutional-neural-networks-object-detection/

Introducing automatic object detection to visual search (Pinterest)

  • keywords: Faster R-CNN
  • blog: https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search
  • demo:https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4
  • review: https://news.developer.nvidia.com/pinterest-introduces-the-future-of-visual-search/?mkt_tok=eyJpIjoiTnpaa01UWXpPRE0xTURFMiIsInQiOiJJRjcybjkwTmtmallORUhLOFFFODBDclFqUlB3SWlRVXJXb1MrQ013TDRIMGxLQWlBczFIeWg0TFRUdnN2UHY2ZWFiXC9QQVwvQzBHM3B0UzBZblpOSmUyU1FcLzNPWXI4cml2VERwTTJsOFwvOEk9In0%3D

Deep Learning for Object Detection with DIGITS

  • blog: https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/

Analyzing The Papers Behind Facebook’s Computer Vision Approach

  • keywords: DeepMask, SharpMask, MultiPathNet
  • blog: https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/

**Easily Create High Quality Object Detectors with Deep Learning **

  • intro: dlib v19.2
  • blog: http://blog.dlib.net/2016/10/easily-create-high-quality-object.html

How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit

  • blog: https://blogs.technet.microsoft.com/machinelearning/2016/10/25/how-to-train-a-deep-learned-object-detection-model-in-cntk/
  • github:https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN

Object Detection in Satellite Imagery, a Low Overhead Approach

  • part 1: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9
  • part 2: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64

You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks

  • part 1: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of
  • part 2: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t

Faster R-CNN Pedestrian and Car Detection

  • blog: https://bigsnarf.wordpress.com/2016/11/07/faster-r-cnn-pedestrian-and-car-detection/
  • ipn: https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb
  • github: https://github.com/bigsnarfdude/Faster-RCNN_TF

你可能感兴趣的:(目标检测)