https://handong1587.github.io/deep_learning/2015/10/09/nlp.html
- Leaderboard
- Papers
- R-CNN
- MultiBox
- SPP-Net
- DeepID-Net
- NoC
- Fast R-CNN
- DeepBox
- MR-CNN
- Faster R-CNN
- YOLO
- AttentionNet
- DenseBox
- SSD
- Inside-Outside Net (ION)
- G-CNN
- HyperNet
- MultiPathNet
- CRAFT
- OHEM
- R-FCN
- MS-CNN
- PVANET
- GBD-Net
- StuffNet
- Feature Pyramid Network (FPN)
- Detection From Video
- T-CNN
- Datasets
- Object Detection in 3D
- Salient Object Detection
- Specific Object Detection
- Face Detection
- UnitBox
- MTCNN
- Datasets / Benchmarks
- Facial Point / Landmark Detection
- People Detection
- Person Head Detection
- Pedestrian Detection
- Vehicle Detection
- Traffic-Sign Detection
- Boundary / Edge / Contour Detection
- Skeleton Detection
- Fruit Detection
- Others
- Object Proposal
- Localization
- Tutorials
- Projects
- Blogs
| Method | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
|---|---|---|---|---|---|---|
| OverFeat | | | | 24.3% | | |
| R-CNN (AlexNet) | 58.5% | 53.7% | 53.3% | 31.4% | | |
| R-CNN (VGG16) | 66.0% | | | | | |
| SPP_net (ZF-5) | 54.2% (1-model), 60.9% (2-model) | | | 31.84% (1-model), 35.11% (6-model) | | |
| DeepID-Net | 64.1% | | | 50.3% | | |
| NoC | 73.3% | | 68.8% | | | |
| Fast-RCNN (VGG16) | 70.0% | 68.8% | 68.4% | | 19.7% (@[0.5-0.95]), 35.9% (@0.5) | |
| MR-CNN | 78.2% | | 73.9% | | | |
| Faster-RCNN (VGG16) | 78.8% | | 75.9% | | 21.9% (@[0.5-0.95]), 42.7% (@0.5) | 198 ms |
| Faster-RCNN (ResNet-101) | 85.6% | | 83.8% | | 37.4% (@[0.5-0.95]), 59.0% (@0.5) | |
| SSD300 (VGG16) | 72.1% | | | | | 58 fps |
| SSD500 (VGG16) | 75.1% | | | | | 23 fps |
| ION | 79.2% | | 76.4% | | | |
| AZ-Net | 70.4% | | | | 22.3% (@[0.5-0.95]), 41.0% (@0.5) | |
| CRAFT | 75.7% | | 71.3% | 48.5% | | |
| OHEM | 78.9% | | 76.3% | | 25.5% (@[0.5-0.95]), 45.9% (@0.5) | |
| R-FCN (ResNet-50) | 77.4% | | | | | 0.12 sec (K40), 0.09 sec (Titan X) |
| R-FCN (ResNet-101) | 79.5% | | | | | 0.17 sec (K40), 0.12 sec (Titan X) |
| R-FCN (ResNet-101), multi-scale train | 83.6% | | 82.0% | | 31.5% (@[0.5-0.95]), 53.2% (@0.5) | |
| PVANet 9.0 | 81.8% | | 82.5% | | | 750 ms (CPU), 46 ms (Titan X) |
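The MSCOCO column reports average precision at two settings: "@0.5" counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, while "@[0.5-0.95]" averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The following is a minimal sketch of that matching criterion only, not a full AP evaluator; the `(x1, y1, x2, y2)` box format and all function names are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_iou_thresholds():
    """The ten thresholds behind the "@[0.5-0.95]" figure: 0.50, 0.55, ..., 0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at_each_threshold(pred, gt):
    """For one predicted/ground-truth pair, report at which thresholds it is a hit."""
    overlap = iou(pred, gt)
    return {t: overlap >= t for t in coco_iou_thresholds()}
```

A box that overlaps its ground truth loosely can count as a hit at 0.5 and a miss at the stricter thresholds, which is why the "@[0.5-0.95]" numbers in the table are consistently lower than the "@0.5" numbers.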
Detection Results: VOC2012
- intro: Competition “comp4” (train on own data)
- homepage: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
Deep Neural Networks for Object Detection
- paper: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
- intro: A deep version of the sliding-window method that predicts bounding boxes directly from each location of the topmost feature map, after the confidences of the underlying object categories are known.
- intro: training a single convolutional network to classify, localize, and detect objects simultaneously can boost accuracy on all three tasks
- arxiv: http://arxiv.org/abs/1312.6229
- github: https://github.com/sermanet/OverFeat
- code: http://cilvr.nyu.edu/doku.php?id=software:overfeat:start
Rich feature hierarchies for accurate object detection and semantic segmentation
- intro: R-CNN
- arxiv: http://arxiv.org/abs/1311.2524
- supp: http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf
- slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf
- slides: http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf
- github: https://github.com/rbgirshick/rcnn
- notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/
- caffe-pr(“Make R-CNN the Caffe detection example”): https://github.com/BVLC/caffe/pull/482
Scalable Object Detection using Deep Neural Networks
- intro: MultiBox. Train a CNN to predict Region of Interest.
- arxiv: http://arxiv.org/abs/1312.2249
- github: https://github.com/google/multibox
- blog: https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html
Scalable, High-Quality Object Detection
- intro: MultiBox
- arxiv: http://arxiv.org/abs/1412.1441
- github: https://github.com/google/multibox
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- intro: ECCV 2014 / TPAMI 2015
- arxiv: http://arxiv.org/abs/1406.4729
- github: https://github.com/ShaoqingRen/SPP_net
- notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/
Learning Rich Features from RGB-D Images for Object Detection and Segmentation
- arxiv: http://arxiv.org/abs/1407.5736
DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
- intro: PAMI 2016
- intro: an extension of R-CNN. box pre-training, cascade on region proposals, deformation layers and context representations
- project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html
- arxiv: http://arxiv.org/abs/1412.5661
Object Detectors Emerge in Deep Scene CNNs
- arxiv: http://arxiv.org/abs/1412.6856
- paper: https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf
- paper: https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf
- slides: http://places.csail.mit.edu/slide_iclr2015.pdf
segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection
- intro: CVPR 2015
- project(code+data): https://www.cs.toronto.edu/~yukun/segdeepm.html
- arxiv: https://arxiv.org/abs/1502.04275
- github: https://github.com/YknZhu/segDeepM
Object Detection Networks on Convolutional Feature Maps
- intro: TPAMI 2015
- arxiv: http://arxiv.org/abs/1504.06066
Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
- arxiv: http://arxiv.org/abs/1504.03293
- slides: http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf
- github: https://github.com/YutingZhang/fgs-obj
Fast R-CNN
- arxiv: http://arxiv.org/abs/1504.08083
- slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
- github: https://github.com/rbgirshick/fast-rcnn
- webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29
- notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/
- notes: http://blog.csdn.net/linj_m/article/details/48930179
- github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn
- github: https://github.com/mahyarnajibi/fast-rcnn-torch
- github: https://github.com/apple2373/chainer-simple-fast-rnn
- github(Tensorflow): https://github.com/zplizzi/tensorflow-fast-rcnn
DeepBox: Learning Objectness with Convolutional Networks
- arxiv: http://arxiv.org/abs/1505.02146
- github: https://github.com/weichengkuo/DeepBox
Object detection via a multi-region & semantic segmentation-aware CNN model
- intro: ICCV 2015. MR-CNN
- arxiv: http://arxiv.org/abs/1505.01749
- github: https://github.com/gidariss/mrcnn-object-detection
- notes: http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/
- notes: http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/
- my notes: Who can tell me why there are a bunch of duplicated sentences in section 7.2 “Detection error analysis”? :-D
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- intro: NIPS 2015
- arxiv: http://arxiv.org/abs/1506.01497
- gitxiv: http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region
- slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf
- github: https://github.com/ShaoqingRen/faster_rcnn
- github: https://github.com/rbgirshick/py-faster-rcnn
- github: https://github.com/mitmul/chainer-faster-rcnn
- github(Torch): https://github.com/andreaskoepf/faster-rcnn.torch
- github(Torch): https://github.com/ruotianluo/Faster-RCNN-Densecap-torch
- github(Tensorflow): https://github.com/smallcorgi/Faster-RCNN_TF
- github(tensorflow): https://github.com/CharlesShang/TFFRCNN
Faster R-CNN in MXNet with distributed implementation and data parallelization
- github: https://github.com/dmlc/mxnet/tree/master/example/rcnn
You Only Look Once: Unified, Real-Time Object Detection
- intro: YOLO uses the whole topmost feature map to predict both confidences for multiple categories and bounding boxes (which are shared for these categories).
- arxiv: http://arxiv.org/abs/1506.02640
- code: http://pjreddie.com/darknet/yolo/
- github: https://github.com/pjreddie/darknet
- reddit: https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/
- github: https://github.com/gliese581gg/YOLO_tensorflow
- github: https://github.com/xingwangsfu/caffe-yolo
- github: https://github.com/frankzhangrui/Darknet-Yolo
- github: https://github.com/BriSkyHekun/py-darknet-yolo
- github: https://github.com/tommy-qichang/yolo.torch
- github: https://github.com/frischzenger/yolo-windows
- github: https://github.com/AlexeyAB/yolo-windows
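The intro above notes that YOLO predicts class confidences and bounding boxes from the whole topmost feature map, with the class probabilities shared by the boxes of a cell. As a rough sketch: the paper divides the image into an S×S grid, each cell predicts B boxes (x, y, w, h, confidence) plus C conditional class probabilities, and a class-specific score is the product of class probability and box confidence. The per-cell memory layout and the function names below are assumptions for illustration and do not mirror the darknet implementation.

```python
S, B, C = 7, 2, 20  # grid size, boxes per cell, classes (PASCAL VOC setting in the paper)


def decode_cell(cell):
    """Split one cell's prediction vector into boxes and class-specific scores.

    Assumed layout (illustrative only): B groups of (x, y, w, h, confidence)
    followed by C conditional class probabilities shared by all B boxes.
    """
    assert len(cell) == B * 5 + C
    class_probs = cell[B * 5:]
    detections = []
    for b in range(B):
        x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
        # Class-specific score = Pr(class | object) * box confidence.
        scores = [p * conf for p in class_probs]
        best = max(range(C), key=lambda k: scores[k])
        detections.append({"box": (x, y, w, h), "class": best, "score": scores[best]})
    return detections


def output_size():
    """Total number of values predicted for one image."""
    return S * S * (B * 5 + C)
```

With these values the network emits a 7×7×30 tensor per image; a real pipeline would then threshold the scores and apply non-maximum suppression, which this sketch leaves out.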
Start Training YOLO with Our Own Data
- intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
- blog: http://guanghan.info/blog/en/my-works/train-yolo/
- github: https://github.com/Guanghan/darknet
R-CNN minus R
- arxiv: http://arxiv.org/abs/1506.06981
AttentionNet: Aggregating Weak Directions for Accurate Object Detection
- intro: ICCV 2015
- intro: state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 human detection task
- arxiv: http://arxiv.org/abs/1506.07704
- slides: https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf
- slides: http://image-net.org/challenges/talks/lunit-kaist-slide.pdf
DenseBox: Unifying Landmark Localization with End to End Object Detection
- arxiv: http://arxiv.org/abs/1509.04874
- demo: http://pan.baidu.com/s/1mgoWWsS
- KITTI result: http://www.cvlibs.net/datasets/kitti/eval_object.php
SSD: Single Shot MultiBox Detector
- arxiv: http://arxiv.org/abs/1512.02325
- paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf
- github: https://github.com/weiliu89/caffe/tree/ssd
- video: http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973
- github(MXNet): https://github.com/zhreshold/mxnet-ssd
- github: https://github.com/zhreshold/mxnet-ssd.cpp
- github(Keras): https://github.com/rykov8/ssd_keras
Why does SSD (Single Shot MultiBox Detector) perform poorly at detecting small objects?
- zhihu: https://www.zhihu.com/question/49455386
Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
- intro: “0.8s per image on a Titan X GPU (excluding proposal generation) without two-stage bounding-box regression and 1.15s per image with it”.
- arxiv: http://arxiv.org/abs/1512.04143
- slides: http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf
- coco-leaderboard: http://mscoco.org/dataset/#detections-leaderboard
Adaptive Object Detection Using Adjacency and Zoom Prediction
- intro: CVPR 2016. AZ-Net
- arxiv: http://arxiv.org/abs/1512.07711
- github: https://github.com/luyongxi/az-net
- youtube: https://www.youtube.com/watch?v=YmFtuNwxaNM
G-CNN: an Iterative Grid Based Object Detector
- arxiv: http://arxiv.org/abs/1512.07729
Factors in Finetuning Deep Model for object detection
Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution
- intro: CVPR 2016. Ranked 3rd with provided data and 2nd with external data in ILSVRC 2015 object detection
- project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html
- arxiv: http://arxiv.org/abs/1601.05150
We don’t need no bounding-boxes: Training object class detectors using only human verification
- arxiv: http://arxiv.org/abs/1602.08405
HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection
- arxiv: http://arxiv.org/abs/1604.00600
A MultiPath Network for Object Detection
- intro: BMVC 2016. Facebook AI Research (FAIR)
- arxiv: http://arxiv.org/abs/1604.02135
- github: https://github.com/facebookresearch/multipathnet
CRAFT Objects from Images
- intro: CVPR 2016. Cascade Region-proposal-network And FasT-rcnn. an extension of Faster R-CNN
- project page: http://byangderek.github.io/projects/craft.html
- arxiv: https://arxiv.org/abs/1604.03239
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf
- github: https://github.com/byangderek/CRAFT
Training Region-based Object Detectors with Online Hard Example Mining
- intro: CVPR 2016 Oral. Online hard example mining (OHEM)
- arxiv: http://arxiv.org/abs/1604.03540
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf
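The core idea of online hard example mining can be sketched in a few lines: run a forward pass over all candidate RoIs, rank them by their current loss, and backpropagate only through the hardest ones. The version below is a simplified illustration under that reading; the function name is hypothetical, and the de-duplication of heavily overlapping RoIs described in the paper is deliberately omitted.

```python
def select_hard_examples(roi_losses, batch_size):
    """Return indices of the `batch_size` RoIs with the highest loss.

    roi_losses: per-RoI loss values from a forward pass over all candidate RoIs.
    Only the returned RoIs would contribute gradients in the backward pass.
    """
    order = sorted(range(len(roi_losses)), key=lambda i: roi_losses[i], reverse=True)
    return order[:batch_size]
```

For example, `select_hard_examples([0.1, 2.3, 0.05, 1.7, 0.9], 2)` picks the RoIs at indices 1 and 3, the two with the largest losses.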
Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
- intro: CVPR 2016
- arxiv: http://arxiv.org/abs/1604.05766
Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers
- paper: http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf
R-FCN: Object Detection via Region-based Fully Convolutional Networks
- arxiv: http://arxiv.org/abs/1605.06409
- github: https://github.com/daijifeng001/R-FCN
- github: https://github.com/Orpine/py-R-FCN
Weakly supervised object detection using pseudo-strong labels
- arxiv: http://arxiv.org/abs/1607.04731
Recycle deep features for better object detection
- arxiv: http://arxiv.org/abs/1607.05066
A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection
- intro: ECCV 2016
- intro: 640×480: 15 fps, 960×720: 8 fps
- arxiv: http://arxiv.org/abs/1607.07155
- github: https://github.com/zhaoweicai/mscnn
- poster: http://www.eccv2016.org/files/posters/P-2B-38.pdf
Multi-stage Object Detection with Group Recursive Learning
- intro: VOC2007: 78.6%, VOC2012: 74.9%
- arxiv: http://arxiv.org/abs/1608.05159
Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection
- intro: SubCNN
- arxiv: http://arxiv.org/abs/1604.04693
- github: https://github.com/yuxng/SubCNN
PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection
- intro: “less channels with more layers”, concatenated ReLU, Inception, and HyperNet, batch normalization, residual connections
- arxiv: http://arxiv.org/abs/1608.08021
- github: https://github.com/sanghoon/pva-faster-rcnn
- leaderboard(PVANet 9.0): http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
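Of the building blocks listed in the PVANET intro, concatenated ReLU (C.ReLU) is the easiest to show in isolation: the activation is concatenated with the activation of its negation, so a layer produces twice as many output channels as it computes. The sketch below shows only that basic operation on a flat list of values; the function names are illustrative, and any extra scaling or shifting applied inside PVANET itself is not modeled.

```python
def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def concatenated_relu(x):
    """C.ReLU: concatenate ReLU(x) and ReLU(-x), doubling the channel count."""
    return relu(x) + relu([-v for v in x])
```

Because both the positive and the negative phase of each response are kept, the preceding convolution can use half as many filters for the same output width, which fits the "less channels with more layers" design quoted above.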
PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
- intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021
- arxiv: https://arxiv.org/abs/1611.08588
Gated Bi-directional CNN for Object Detection
- intro: The Chinese University of Hong Kong & Sensetime Group Limited
- paper: http://link.springer.com/chapter/10.1007/978-3-319-46478-7_22
- mirror: https://pan.baidu.com/s/1dFohO7v
Crafting GBD-Net for Object Detection
- intro: winner of the ImageNet object detection challenge of 2016. CUImage and CUVideo
- intro: gated bi-directional CNN (GBD-Net)
- arxiv: https://arxiv.org/abs/1610.02579
- github: https://github.com/craftGBD/craftGBD
StuffNet: Using ‘Stuff’ to Improve Object Detection
- arxiv: https://arxiv.org/abs/1610.05861
Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene
- arxiv: https://arxiv.org/abs/1610.09609
Hierarchical Object Detection with Deep Reinforcement Learning
- intro: Deep Reinforcement Learning Workshop (NIPS 2016)
- project page: https://imatge-upc.github.io/detection-2016-nipsws/
- arxiv: https://arxiv.org/abs/1611.03718
- github: https://github.com/imatge-upc/detection-2016-nipsws
Learning to detect and localize many objects from few examples
- arxiv: https://arxiv.org/abs/1611.05664
Speed/accuracy trade-offs for modern convolutional object detectors
- intro: Google Research
- arxiv: https://arxiv.org/abs/1611.10012
SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- arxiv: https://arxiv.org/abs/1612.01051
Feature Pyramid Networks for Object Detection
- intro: Facebook AI Research
- arxiv: https://arxiv.org/abs/1612.03144
Learning Object Class Detectors from Weakly Annotated Video
- intro: CVPR 2012
- paper: https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf
Analysing domain shift factors between videos and images for object detection
- arxiv: https://arxiv.org/abs/1501.01186
Video Object Recognition
- slides: http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video%20Object%20Recognition.pptx
Deep Learning for Saliency Prediction in Natural Video
- intro: Submitted on 12 Jan 2016
- keywords: Deep learning, saliency map, optical flow, convolution network, contrast features
- paper: https://hal.archives-ouvertes.fr/hal-01251614/document
T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
- intro: Winning solution in ILSVRC2015 Object Detection from Video(VID) Task
- arxiv: http://arxiv.org/abs/1604.02532
- github: https://github.com/myfavouritekk/T-CNN
Object Detection from Video Tubelets with Convolutional Neural Networks
- intro: CVPR 2016 Spotlight paper
- arxiv: https://arxiv.org/abs/1604.04053
- paper: http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf
- github: https://github.com/myfavouritekk/vdetlib
Object Detection in Videos with Tubelets and Multi-context Cues
- intro: SenseTime Group
- slides: http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf
- slides: http://image-net.org/challenges/talks/Object%20Detection%20in%20Videos%20with%20Tubelets%20and%20Multi-context%20Cues%20-%20Final.pdf
Context Matters: Refining Object Detection in Video with Recurrent Neural Networks
- intro: BMVC 2016
- keywords: pseudo-labeler
- arxiv: http://arxiv.org/abs/1607.04648
- paper: http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf
CNN Based Object Detection in Large Video Images
- intro: WangTao @ iQIYI
- keywords: object retrieval, object detection, scene classification
- slides: http://on-demand.gputechconf.com/gtc/2016/presentation/s6362-wang-tao-cnn-based-object-detection-large-video-images.pdf
YouTube-Objects dataset v2.2
- homepage: http://calvin.inf.ed.ac.uk/datasets/youtube-objects-dataset/
ILSVRC2015: Object detection from video (VID)
- homepage: http://vision.cs.unc.edu/ilsvrc2015/download-videos-3j16.php#vid
Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks
- arxiv: https://arxiv.org/abs/1609.06666
This task involves predicting the salient regions of an image given by human eye fixations.
Best Deep Saliency Detection Models (CVPR 2016 & 2015)
- homepage: http://i.cs.hku.hk/~yzyu/vision.html
Large-scale optimization of hierarchical features for saliency prediction in natural images
- paper: http://coxlab.org/pdfs/cvpr2014_vig_saliency.pdf
Predicting Eye Fixations using Convolutional Neural Networks
- paper: http://www.escience.cn/system/file?fileId=72648
Saliency Detection by Multi-Context Deep Learning
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhao_Saliency_Detection_by_2015_CVPR_paper.pdf
DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection
- arxiv: http://arxiv.org/abs/1510.05484
SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection
- paper: www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html
Shallow and Deep Convolutional Networks for Saliency Prediction
- arxiv: http://arxiv.org/abs/1603.00845
- github: https://github.com/imatge-upc/saliency-2016-cvpr
Recurrent Attentional Networks for Saliency Detection
- intro: CVPR 2016. recurrent attentional convolutional-deconvolution network (RACDNN)
- arxiv: http://arxiv.org/abs/1604.03227
Two-Stream Convolutional Networks for Dynamic Saliency Prediction
- arxiv: http://arxiv.org/abs/1607.04730
Unconstrained Salient Object Detection
Unconstrained Salient Object Detection via Proposal Subset Optimization
- intro: CVPR 2016
- project page: http://cs-people.bu.edu/jmzhang/sod.html
- paper: http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf
- github: https://github.com/jimmie33/SOD
- caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection
DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DHSNet_Deep_Hierarchical_CVPR_2016_paper.pdf
Salient Object Subitizing
- intro: CVPR 2015
- intro: predicting the existence and the number of salient objects in an image using holistic cues
- project page: http://cs-people.bu.edu/jmzhang/sos.html
- arxiv: http://arxiv.org/abs/1607.07525
- paper: http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf
- caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing
Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection
- intro: ACMMM 2016. deeply-supervised recurrent convolutional neural network (DSRCNN)
- arxiv: http://arxiv.org/abs/1608.05177
Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1608.05186
Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection
- arxiv: http://arxiv.org/abs/1608.08029
A Deep Multi-Level Network for Saliency Prediction
- arxiv: http://arxiv.org/abs/1609.01064
Visual Saliency Detection Based on Multiscale Deep CNN Features
- intro: IEEE Transactions on Image Processing
- arxiv: http://arxiv.org/abs/1609.02077
A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
- intro: DSCLRCN
- arxiv: https://arxiv.org/abs/1610.01708
Deeply supervised salient object detection with short connections
- arxiv: https://arxiv.org/abs/1611.04849
Weakly Supervised Top-down Salient Object Detection
- intro: Nanyang Technological University
- arxiv: https://arxiv.org/abs/1611.05345
Multi-view Face Detection Using Deep Convolutional Neural Networks
- intro: Yahoo
- arxiv: http://arxiv.org/abs/1502.02766
From Facial Parts Responses to Face Detection: A Deep Learning Approach
- project page: http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html
Compact Convolutional Neural Network Cascade for Face Detection
- arxiv: http://arxiv.org/abs/1508.01292
- github: https://github.com/Bkmz21/FD-Evaluation
Face Detection with End-to-End Integration of a ConvNet and a 3D Model
- intro: ECCV 2016
- arxiv: https://arxiv.org/abs/1606.00850
- github(MXNet): https://github.com/tfwu/FaceDetection-ConvNet-3D
Supervised Transformer Network for Efficient Face Detection
- arxiv: http://arxiv.org/abs/1607.05477
UnitBox: An Advanced Object Detection Network
- intro: ACM MM 2016
- arxiv: http://arxiv.org/abs/1608.01471
Bootstrapping Face Detection with Hard Negative Examples
- author: Shaohua Wan @ Xiaomi.
- intro: Faster R-CNN, hard negative mining. state-of-the-art on the FDDB dataset
- arxiv: http://arxiv.org/abs/1608.02236
Grid Loss: Detecting Occluded Faces
- intro: ECCV 2016
- arxiv: https://arxiv.org/abs/1609.00129
- paper: http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf
- poster: http://www.eccv2016.org/files/posters/P-2A-34.pdf
A Multi-Scale Cascade Fully Convolutional Network Face Detector
- intro: ICPR 2016
- arxiv: http://arxiv.org/abs/1609.03536
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks
- project page: https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html
- arxiv: https://arxiv.org/abs/1604.02878
- github(Matlab): https://github.com/kpzhang93/MTCNN_face_detection_alignment
- github(MXNet): https://github.com/pangyupo/mxnet_mtcnn_face_detection
- github: https://github.com/DaFuCoding/MTCNN_Caffe
FDDB: Face Detection Data Set and Benchmark
- homepage: http://vis-www.cs.umass.edu/fddb/index.html
- results: http://vis-www.cs.umass.edu/fddb/results.html
WIDER FACE: A Face Detection Benchmark
- homepage: http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/
- arxiv: http://arxiv.org/abs/1511.06523
Facial Point / Landmark Detection
Deep Convolutional Network Cascade for Facial Point Detection
- homepage: http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm
- paper: http://www.ee.cuhk.edu.hk/~xgwang/papers/sunWTcvpr13.pdf
- github: https://github.com/luoyetx/deep-landmark
A Recurrent Encoder-Decoder Network for Sequential Face Alignment
- intro: ECCV 2016
- arxiv: https://arxiv.org/abs/1608.05477
Detecting facial landmarks in the video based on a hybrid framework
- arxiv: http://arxiv.org/abs/1609.06441
Deep Constrained Local Models for Facial Landmark Detection
- arxiv: https://arxiv.org/abs/1611.08657
End-to-end people detection in crowded scenes
- arxiv: http://arxiv.org/abs/1506.04878
- github: https://github.com/Russell91/reinspect
- ipn: http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb
Detecting People in Artwork with CNNs
- intro: ECCV 2016 Workshops
- arxiv: https://arxiv.org/abs/1610.08871
Context-aware CNNs for person head detection
- arxiv: http://arxiv.org/abs/1511.07917
- github: https://github.com/aosokin/cnn_head_detection
Pedestrian Detection aided by Deep Learning Semantic Tasks
- intro: CVPR 2015
- project page: http://mmlab.ie.cuhk.edu.hk/projects/TA-CNN/
- paper: http://arxiv.org/abs/1412.0069
Deep Learning Strong Parts for Pedestrian Detection
- intro: ICCV 2015. CUHK. DeepParts
- intro: Achieving 11.89% average miss rate on Caltech Pedestrian Dataset
- paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf
Deep convolutional neural networks for pedestrian detection
- arxiv: http://arxiv.org/abs/1510.03608
- github: https://github.com/DenisTome/DeepPed
New algorithm improves speed and accuracy of pedestrian detection
- blog: http://www.eurekalert.org/pub_releases/2016-02/uoc--nai020516.php
Pushing the Limits of Deep CNNs for Pedestrian Detection
- intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
- arxiv: http://arxiv.org/abs/1603.04525
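The Caltech pedestrian numbers quoted in this section are log-average miss rates. As commonly described for that benchmark, the miss rate is sampled at nine false-positives-per-image (FPPI) reference points spaced evenly in log space between 10^-2 and 10^0, and the samples are averaged in the log domain. The sketch below follows that description; the input format, the fallback of 1.0 when the curve does not reach a reference point, and the function name are assumptions, and the official evaluation code may differ in detail.

```python
import math


def log_average_miss_rate(fppi, miss_rate):
    """Geometric mean of the miss rate at nine log-spaced FPPI reference points.

    fppi: false positives per image, sorted in ascending order.
    miss_rate: miss rate at each corresponding FPPI value.
    """
    refs = [10 ** (-2 + 0.25 * i) for i in range(9)]  # 0.01 ... 1.0
    samples = []
    for ref in refs:
        # Miss rate at the largest FPPI not exceeding the reference point;
        # if the curve starts above it, fall back to a miss rate of 1.0.
        candidates = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        samples.append(candidates[-1] if candidates else 1.0)
    # Average in the log domain; the floor avoids log(0).
    return math.exp(sum(math.log(max(1e-10, m)) for m in samples) / len(samples))
```

Averaging in the log domain keeps a single very good or very bad operating point from dominating the summary figure.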
A Real-Time Deep Learning Pedestrian Detector for Robot Navigation
- arxiv: http://arxiv.org/abs/1607.04436
A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation
- arxiv: http://arxiv.org/abs/1607.04441
Is Faster R-CNN Doing Well for Pedestrian Detection?
- arxiv: http://arxiv.org/abs/1607.07032
- github: https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian
Reduced Memory Region Based Deep Convolutional Neural Network Detection
- intro: IEEE 2016 ICCE-Berlin
- arxiv: http://arxiv.org/abs/1609.02500
Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection
- arxiv: https://arxiv.org/abs/1610.03466
Multispectral Deep Neural Networks for Pedestrian Detection
- intro: BMVC 2016 oral
- arxiv: https://arxiv.org/abs/1611.02644
DAVE: A Unified Framework for Fast Vehicle Detection and Annotation
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1607.04564
Traffic-Sign Detection and Classification in the Wild
- project page(code+dataset): http://cg.cs.tsinghua.edu.cn/traffic-sign/
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf
- code & model: http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip
Holistically-Nested Edge Detection
- intro: ICCV 2015, Marr Prize
- paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf
- arxiv: http://arxiv.org/abs/1504.06375
- github: https://github.com/s9xie/hed
Unsupervised Learning of Edges
- intro: CVPR 2016. Facebook AI Research
- arxiv: http://arxiv.org/abs/1511.04166
- zn-blog: http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html
Pushing the Boundaries of Boundary Detection using Deep Learning
- arxiv: http://arxiv.org/abs/1511.07386
Convolutional Oriented Boundaries
- intro: ECCV 2016
- arxiv: http://arxiv.org/abs/1608.02755
Richer Convolutional Features for Edge Detection
- intro: richer convolutional features (RCF)
- arxiv: https://arxiv.org/abs/1612.02103
Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
- arxiv: http://arxiv.org/abs/1603.09446
- github: https://github.com/zeakey/DeepSkeleton
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
- arxiv: http://arxiv.org/abs/1609.03659
Deep Fruit Detection in Orchards
- arxiv: https://arxiv.org/abs/1610.03677
Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards
- intro: The Journal of Field Robotics in May 2016
- project page: http://confluence.acfr.usyd.edu.au/display/AGPub/
- arxiv: https://arxiv.org/abs/1610.08120
Deep Deformation Network for Object Landmark Localization
- arxiv: http://arxiv.org/abs/1605.01014
Fashion Landmark Detection in the Wild
- arxiv: http://arxiv.org/abs/1608.03049
Deep Learning for Fast and Accurate Fashion Item Detection
- intro: Kuznech Inc.
- intro: MultiBox and Fast R-CNN
- paper: https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf
Visual Relationship Detection with Language Priors
- intro: ECCV 2016 oral
- paper: https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf
- github: https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection
OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)
- github: https://github.com/geometalab/OSMDeepOD
Selfie Detection by Synergy-Constraint Based Convolutional Neural Network
- intro: IEEE SITIS 2016
- arxiv: https://arxiv.org/abs/1611.04357
Associative Embedding:End-to-End Learning for Joint Detection and Grouping
- arxiv: https://arxiv.org/abs/1611.05424
Deep Cuboid Detection: Beyond 2D Bounding Boxes
- intro: CMU & Magic Leap
- arxiv: https://arxiv.org/abs/1611.10010
DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers
- arxiv: http://arxiv.org/abs/1510.04445
- github: https://github.com/aghodrati/deepproposal
Scale-aware Pixel-wise Object Proposal Networks
- intro: IEEE Transactions on Image Processing
- arxiv: http://arxiv.org/abs/1601.04798
Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
- intro: AttractioNet
- arxiv: https://arxiv.org/abs/1606.04446
- github: https://github.com/gidariss/AttractioNet
Learning to Segment Object Proposals via Recursive Neural Networks
- arxiv: https://arxiv.org/abs/1612.01057
Beyond Bounding Boxes: Precise Localization of Objects in Images
- intro: PhD Thesis
- homepage: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html
- phd-thesis: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf
- github(“SDS using hypercolumns”): https://github.com/bharath272/sds
Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
- arxiv: http://arxiv.org/abs/1503.00949
Weakly Supervised Object Localization Using Size Estimates
- arxiv: http://arxiv.org/abs/1608.04314
Localizing objects using referring expressions
- intro: ECCV 2016
- keywords: LSTM, multiple instance learning (MIL)
- paper: http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf
- github: https://github.com/varun-nagaraja/referring-expressions
LocNet: Improving Localization Accuracy for Object Detection
- arxiv: http://arxiv.org/abs/1511.07763
- github: https://github.com/gidariss/LocNet
Learning Deep Features for Discriminative Localization
- homepage: http://cnnlocalization.csail.mit.edu/
- arxiv: http://arxiv.org/abs/1512.04150
- github(Tensorflow): https://github.com/jazzsaxmafia/Weakly_detector
- github: https://github.com/metalbubble/CAM
- github: https://github.com/tdeboissiere/VGG16CAM-keras
ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization
- intro: ECCV 2016
- project page: http://www.di.ens.fr/willow/research/contextlocnet/
- arxiv: http://arxiv.org/abs/1609.04331
- github: https://github.com/vadimkantorov/contextlocnet
Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection
- slides: http://research.microsoft.com/en-us/um/people/kahe/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf
TensorBox: a simple framework for training neural networks to detect objects in images
- intro: “The basic model implements the simple and robust GoogLeNet-OverFeat algorithm. We additionally provide an implementation of the ReInspect algorithm”
- github: https://github.com/Russell91/TensorBox
Object detection in torch: Implementation of some object detection frameworks in torch
- github: https://github.com/fmassa/object-detection.torch
Using DIGITS to train an Object Detection network
- github: https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md
FCN-MultiBox Detector
- intro: Fully convolutional MultiBox Detector (like SSD) implemented in Torch.
- github: https://github.com/teaonly/FMD.torch
Convolutional Neural Networks for Object Detection
- blog: http://rnd.azoft.com/convolutional-neural-networks-object-detection/
Introducing automatic object detection to visual search (Pinterest)
- keywords: Faster R-CNN
- blog: https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search
- demo: https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4
- review: https://news.developer.nvidia.com/pinterest-introduces-the-future-of-visual-search/
Deep Learning for Object Detection with DIGITS
- blog: https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/
Analyzing The Papers Behind Facebook’s Computer Vision Approach
- keywords: DeepMask, SharpMask, MultiPathNet
- blog: https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/
Easily Create High Quality Object Detectors with Deep Learning
- intro: dlib v19.2
- blog: http://blog.dlib.net/2016/10/easily-create-high-quality-object.html
How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit
- blog: https://blogs.technet.microsoft.com/machinelearning/2016/10/25/how-to-train-a-deep-learned-object-detection-model-in-cntk/
- github: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN
Object Detection in Satellite Imagery, a Low Overhead Approach
- part 1: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9
- part 2: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64
You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks
- part 1: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of
- part 2: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t
Faster R-CNN Pedestrian and Car Detection
- blog: https://bigsnarf.wordpress.com/2016/11/07/faster-r-cnn-pedestrian-and-car-detection/
- ipn: https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb
- github: https://github.com/bigsnarfdude/Faster-RCNN_TF