报任安者

The Most Complete and Up-to-Date Deep Learning Object Detection Methods: Papers and Code


Leaderboard
Papers
1. R-CNN
2. Fast R-CNN
3. Faster R-CNN
4. MultiBox
5. SPP-Net
6. DeepID-Net
7. NoC
8. DeepBox
9. MR-CNN
10. YOLO
11. YOLOv2
12. AttentionNet
13. DenseBox
14. SSD
15. DSSD
16. Inside-Outside Net (ION)
17. G-CNN
18. HyperNet
19. MultiPathNet
20. CRAFT
21. OHEM
22. R-FCN
23. MS-CNN
24. PVANET
25. GBD-Net
26. StuffNet
27. Feature Pyramid Network (FPN)
28. CC-Net
29. DSOD
30. NMS
31. Weakly Supervised Object Detection
Detection From Video
1. T-CNN
Object Detection in 3D
Object Detection on RGB-D
Salient Object Detection
1. Saliency Detection in Video
Visual Relationship Detection
Specific Object Detection
1. Face Detection
  1. UnitBox
  2. MTCNN
2. Facial Point / Landmark Detection
People Detection
1. Person Head Detection
2. Pedestrian Detection
3. Vehicle Detection
4. Traffic-Sign Detection
5. Boundary / Edge / Contour Detection
6. Skeleton Detection
7. Fruit Detection
8. Part Detection
Object Proposal
Localization
Tutorials / Talks
Projects
Tools
Blogs

Method	VOC2007	VOC2010	VOC2012	ILSVRC 2013	MSCOCO 2015	Speed
OverFeat				24.3%
R-CNN (AlexNet)	58.5%	53.7%	53.3%	31.4%
R-CNN (VGG16)	66.0%
SPP_net(ZF-5)	54.2%(1-model), 60.9%(2-model)			31.84%(1-model), 35.11%(6-model)
DeepID-Net	64.1%			50.3%
NoC	73.3%		68.8%
Fast-RCNN (VGG16)	70.0%	68.8%	68.4%		19.7%(@[0.5-0.95]), 35.9%(@0.5)
MR-CNN	78.2%		73.9%
Faster-RCNN (VGG16)	78.8%		75.9%		21.9%(@[0.5-0.95]), 42.7%(@0.5)	198ms
Faster-RCNN (ResNet-101)	85.6%		83.8%		37.4%(@[0.5-0.95]), 59.0%(@0.5)
YOLO	63.4%		57.9%			45 fps
YOLO VGG-16	66.4%					21 fps
YOLOv2 544 × 544	78.6%		73.4%		21.6%(@[0.5-0.95]), 44.0%(@0.5)	40 fps
SSD300 (VGG16)	77.2%		75.8%		25.1%(@[0.5-0.95]), 43.1%(@0.5)	46 fps
SSD512 (VGG16)	79.8%		78.5%		28.8%(@[0.5-0.95]), 48.5%(@0.5)	19 fps
ION	79.2%		76.4%
CRAFT	75.7%		71.3%	48.5%
OHEM	78.9%		76.3%		25.5%(@[0.5-0.95]), 45.9%(@0.5)
R-FCN (ResNet-50)	77.4%					0.12sec(K40), 0.09sec(Titan X)
R-FCN (ResNet-101)	79.5%					0.17sec(K40), 0.12sec(Titan X)
R-FCN (ResNet-101), multi-scale train	83.6%		82.0%		31.5%(@[0.5-0.95]), 53.2%(@0.5)
PVANet 9.0	89.8%		84.2%			750ms(CPU), 46ms(Titan X)
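
A note on the MSCOCO 2015 column: the two figures per method are average precision under different box-overlap criteria. The value marked @0.5 counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5; the value marked @[0.5-0.95] averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The Python sketch below only illustrates that criterion and is not code from any of the listed projects; the function names and the (x1, y1, x2, y2) box layout are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds 0.50, 0.55, ..., 0.95 (built from integers to avoid float drift).
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which pred_box would count as a correct detection."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    pred = (0, 0, 10, 8)  # hypothetical prediction covering 80% of gt
    print(iou(pred, gt), matched_thresholds(pred, gt))
```

A box that passes at 0.5 but fails at 0.75 contributes to the @0.5 figure only, which is why the averaged figure is always the lower of the two.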

Leaderboard

Detection Results: VOC2012

intro: Competition “comp4” (train on additional data)
homepage: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

Papers

Deep Neural Networks for Object Detection

paper: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

arxiv: http://arxiv.org/abs/1312.6229
github: https://github.com/sermanet/OverFeat
code: http://cilvr.nyu.edu/doku.php?id=software:overfeat:start

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

intro: R-CNN
arxiv: http://arxiv.org/abs/1311.2524
supp: http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf
slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf
slides: http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf
github: https://github.com/rbgirshick/rcnn
notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/
caffe-pr(“Make R-CNN the Caffe detection example”): https://github.com/BVLC/caffe/pull/482

Fast R-CNN

arxiv: http://arxiv.org/abs/1504.08083
slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
github: https://github.com/rbgirshick/fast-rcnn
github(COCO-branch): https://github.com/rbgirshick/fast-rcnn/tree/coco
webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29
notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/
notes: http://blog.csdn.net/linj_m/article/details/48930179
github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn
github: https://github.com/mahyarnajibi/fast-rcnn-torch
github: https://github.com/apple2373/chainer-simple-fast-rnn
github: https://github.com/zplizzi/tensorflow-fast-rcnn

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03414
paper: http://abhinavsh.info/papers/pdfs/adversarial_object_detection.pdf
github(Caffe): https://github.com/xiaolonw/adversarial-frcnn

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

intro: NIPS 2015
arxiv: http://arxiv.org/abs/1506.01497
gitxiv: http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region
slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf
github(official, Matlab): https://github.com/ShaoqingRen/faster_rcnn
github: https://github.com/rbgirshick/py-faster-rcnn
github: https://github.com/mitmul/chainer-faster-rcnn
github: https://github.com/andreaskoepf/faster-rcnn.torch
github: https://github.com/ruotianluo/Faster-RCNN-Densecap-torch
github: https://github.com/smallcorgi/Faster-RCNN_TF
github: https://github.com/CharlesShang/TFFRCNN
github(C++ demo): https://github.com/YihangLou/FasterRCNN-Encapsulation-Cplusplus
github: https://github.com/yhenon/keras-frcnn

Faster R-CNN in MXNet with distributed implementation and data parallelization

github: https://github.com/dmlc/mxnet/tree/master/example/rcnn

Contextual Priming and Feedback for Faster R-CNN

intro: ECCV 2016. Carnegie Mellon University
paper: http://abhinavsh.info/context_priming_feedback.pdf
poster: http://www.eccv2016.org/files/posters/P-1A-20.pdf

An Implementation of Faster RCNN with Study for Region Sampling

intro: Technical Report, 3 pages. CMU
arxiv: https://arxiv.org/abs/1702.02138
github: https://github.com/endernewton/tf-faster-rcnn

MultiBox

Scalable Object Detection using Deep Neural Networks

intro: first MultiBox. Train a CNN to predict Region of Interest.
arxiv: http://arxiv.org/abs/1312.2249
github: https://github.com/google/multibox
blog: https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html

Scalable, High-Quality Object Detection

intro: second MultiBox
arxiv: http://arxiv.org/abs/1412.1441
github: https://github.com/google/multibox

SPP-Net

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

intro: ECCV 2014 / TPAMI 2015
arxiv: http://arxiv.org/abs/1406.4729
github: https://github.com/ShaoqingRen/SPP_net
notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/

DeepID-Net

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

intro: PAMI 2016
intro: an extension of R-CNN. box pre-training, cascade on region proposals, deformation layers and context representations
project page: http://www.ee.cuhk.edu.hk/%CB%9Cwlouyang/projects/imagenetDeepId/index.html
arxiv: http://arxiv.org/abs/1412.5661

Object Detectors Emerge in Deep Scene CNNs

intro: ICLR 2015
arxiv: http://arxiv.org/abs/1412.6856
paper: https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf
paper: https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf
slides: http://places.csail.mit.edu/slide_iclr2015.pdf

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

intro: CVPR 2015
project(code+data): https://www.cs.toronto.edu/~yukun/segdeepm.html
arxiv: https://arxiv.org/abs/1502.04275
github: https://github.com/YknZhu/segDeepM

NoC

Object Detection Networks on Convolutional Feature Maps

intro: TPAMI 2015
arxiv: http://arxiv.org/abs/1504.06066

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

arxiv: http://arxiv.org/abs/1504.03293
slides: http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf
github: https://github.com/YutingZhang/fgs-obj

DeepBox

DeepBox: Learning Objectness with Convolutional Networks

arxiv: http://arxiv.org/abs/1505.02146
github: https://github.com/weichengkuo/DeepBox

MR-CNN

Object detection via a multi-region & semantic segmentation-aware CNN model

intro: ICCV 2015. MR-CNN
arxiv: http://arxiv.org/abs/1505.01749
github: https://github.com/gidariss/mrcnn-object-detection
notes: http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/
notes: http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/

YOLO

You Only Look Once: Unified, Real-Time Object Detection

arxiv: http://arxiv.org/abs/1506.02640
code: http://pjreddie.com/darknet/yolo/
github: https://github.com/pjreddie/darknet
blog: https://pjreddie.com/publications/yolo/
slides: https://docs.google.com/presentation/d/1aeRvtKG21KHdD5lg6Hgyhx5rPq_ZOsGjG5rJ1HP7BbA/pub?start=false&loop=false&delayms=3000&slide=id.p
reddit: https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/
github: https://github.com/gliese581gg/YOLO_tensorflow
github: https://github.com/xingwangsfu/caffe-yolo
github: https://github.com/frankzhangrui/Darknet-Yolo
github: https://github.com/BriSkyHekun/py-darknet-yolo
github: https://github.com/tommy-qichang/yolo.torch
github: https://github.com/frischzenger/yolo-windows
github: https://github.com/AlexeyAB/yolo-windows
github: https://github.com/nilboy/tensorflow-yolo

darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++

blog: https://thtrieu.github.io/notes/yolo-tensorflow-graph-buffer-cpp
github: https://github.com/thtrieu/darkflow

Start Training YOLO with Our Own Data

intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
blog: http://guanghan.info/blog/en/my-works/train-yolo/
github: https://github.com/Guanghan/darknet

YOLO: Core ML versus MPSNNGraph

intro: Tiny YOLO for iOS implemented using CoreML but also using the new MPS graph API.
blog: http://machinethink.net/blog/yolo-coreml-versus-mps-graph/
github: https://github.com/hollance/YOLO-CoreML-MPSNNGraph

TensorFlow YOLO object detection on Android

intro: Real-time object detection on Android using the YOLO network with TensorFlow
github: https://github.com/natanielruiz/android-yolo

Computer Vision in iOS – Object Detection

blog: https://sriraghu.com/2017/07/12/computer-vision-in-ios-object-detection/
github: https://github.com/r4ghu/iOS-CoreML-Yolo

YOLOv2

YOLO9000: Better, Faster, Stronger

arxiv: https://arxiv.org/abs/1612.08242
code: http://pjreddie.com/yolo9000/
github(Chainer): https://github.com/leetenki/YOLOv2
github(Keras): https://github.com/allanzelener/YAD2K
github(PyTorch): https://github.com/longcw/yolo2-pytorch
github(Tensorflow): https://github.com/hizhangp/yolo_tensorflow
github(Windows): https://github.com/AlexeyAB/darknet
github: https://github.com/choasUp/caffe-yolo9000
github: https://github.com/philipperemy/yolo-9000

Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2

github: https://github.com/AlexeyAB/Yolo_mark

R-CNN minus R

arxiv: http://arxiv.org/abs/1506.06981

AttentionNet

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

intro: ICCV 2015
intro: state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 human detection task
arxiv: http://arxiv.org/abs/1506.07704
slides: https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf
slides: http://image-net.org/challenges/talks/lunit-kaist-slide.pdf

DenseBox

DenseBox: Unifying Landmark Localization with End to End Object Detection

arxiv: http://arxiv.org/abs/1509.04874
demo: http://pan.baidu.com/s/1mgoWWsS
KITTI result: http://www.cvlibs.net/datasets/kitti/eval_object.php

SSD

SSD: Single Shot MultiBox Detector

intro: ECCV 2016 Oral
arxiv: http://arxiv.org/abs/1512.02325
paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf
slides: http://www.cs.unc.edu/%7Ewliu/papers/ssd_eccv2016_slide.pdf
github(Official): https://github.com/weiliu89/caffe/tree/ssd
video: http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973
github: https://github.com/zhreshold/mxnet-ssd
github: https://github.com/zhreshold/mxnet-ssd.cpp
github: https://github.com/rykov8/ssd_keras
github: https://github.com/balancap/SSD-Tensorflow
github: https://github.com/amdegroot/ssd.pytorch
github(Caffe): https://github.com/chuanqi305/MobileNet-SSD

What’s the diffience in performance between this new code you pushed and the previous code? #327

https://github.com/weiliu89/caffe/issues/327

Enhancement of SSD by concatenating feature maps for object detection

intro: rainbow SSD (R-SSD)
arxiv: https://arxiv.org/abs/1705.09587

DSSD

DSSD : Deconvolutional Single Shot Detector

intro: UNC Chapel Hill & Amazon Inc
arxiv: https://arxiv.org/abs/1701.06659
demo: http://120.52.72.53/www.cs.unc.edu/c3pr90ntc0td/~cyfu/dssd_lalaland.mp4

Context-aware Single-Shot Detector

keywords: CSSD, DiCSSD, DeCSSD, effective receptive fields (ERFs), theoretical receptive fields (TRFs)
arxiv: https://arxiv.org/abs/1707.08682

Feature-Fused SSD: Fast Detection for Small Objects

https://arxiv.org/abs/1709.05054

Inside-Outside Net (ION)

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

intro: “0.8s per image on a Titan X GPU (excluding proposal generation) without two-stage bounding-box regression and 1.15s per image with it”.
arxiv: http://arxiv.org/abs/1512.04143
slides: http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf
coco-leaderboard: http://mscoco.org/dataset/#detections-leaderboard

Adaptive Object Detection Using Adjacency and Zoom Prediction

intro: CVPR 2016. AZ-Net
arxiv: http://arxiv.org/abs/1512.07711
github: https://github.com/luyongxi/az-net
youtube: https://www.youtube.com/watch?v=YmFtuNwxaNM

G-CNN

G-CNN: an Iterative Grid Based Object Detector

arxiv: http://arxiv.org/abs/1512.07729

Factors in Finetuning Deep Model for object detection

Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

intro: CVPR 2016. Ranked 3rd for provided data and 2nd for external data on ILSVRC 2015 object detection
project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html
arxiv: http://arxiv.org/abs/1601.05150

We don’t need no bounding-boxes: Training object class detectors using only human verification

arxiv: http://arxiv.org/abs/1602.08405

HyperNet

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

arxiv: http://arxiv.org/abs/1604.00600

MultiPathNet

A MultiPath Network for Object Detection

intro: BMVC 2016. Facebook AI Research (FAIR)
arxiv: http://arxiv.org/abs/1604.02135
github: https://github.com/facebookresearch/multipathnet

CRAFT

CRAFT Objects from Images

intro: CVPR 2016. Cascade Region-proposal-network And FasT-rcnn. an extension of Faster R-CNN
project page: http://byangderek.github.io/projects/craft.html
arxiv: https://arxiv.org/abs/1604.03239
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf
github: https://github.com/byangderek/CRAFT

OHEM

Training Region-based Object Detectors with Online Hard Example Mining

intro: CVPR 2016 Oral. Online hard example mining (OHEM)
arxiv: http://arxiv.org/abs/1604.03540
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf
github(Official): https://github.com/abhi2610/ohem
author page: http://abhinav-shrivastava.info/

Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers

intro: CVPR 2016
keywords: scale-dependent pooling (SDP), cascaded rejection classifiers (CRC)
paper: http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

arxiv: http://arxiv.org/abs/1605.06409
github: https://github.com/daijifeng001/R-FCN
github: https://github.com/Orpine/py-R-FCN
github: https://github.com/PureDiors/pytorch_RFCN
github: https://github.com/bharatsingh430/py-R-FCN-multiGPU
github: https://github.com/xdever/RFCN-tensorflow

Recycle deep features for better object detection

arxiv: http://arxiv.org/abs/1607.05066

MS-CNN

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

intro: ECCV 2016
intro: 640×480: 15 fps, 960×720: 8 fps
arxiv: http://arxiv.org/abs/1607.07155
github: https://github.com/zhaoweicai/mscnn
poster: http://www.eccv2016.org/files/posters/P-2B-38.pdf

Multi-stage Object Detection with Group Recursive Learning

intro: VOC2007: 78.6%, VOC2012: 74.9%
arxiv: http://arxiv.org/abs/1608.05159

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

intro: WACV 2017. SubCNN
arxiv: http://arxiv.org/abs/1604.04693
github: https://github.com/tanshen/SubCNN

PVANET

PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

intro: “less channels with more layers”, concatenated ReLU, Inception, and HyperNet, batch normalization, residual connections
arxiv: http://arxiv.org/abs/1608.08021
github: https://github.com/sanghoon/pva-faster-rcnn
leaderboard(PVANet 9.0): http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021
arxiv: https://arxiv.org/abs/1611.08588

GBD-Net

Gated Bi-directional CNN for Object Detection

intro: The Chinese University of Hong Kong & Sensetime Group Limited
paper: http://link.springer.com/chapter/10.1007/978-3-319-46478-7_22
mirror: https://pan.baidu.com/s/1dFohO7v

Crafting GBD-Net for Object Detection

intro: winner of the ImageNet object detection challenge of 2016. CUImage and CUVideo
intro: gated bi-directional CNN (GBD-Net)
arxiv: https://arxiv.org/abs/1610.02579
github: https://github.com/craftGBD/craftGBD

StuffNet

StuffNet: Using ‘Stuff’ to Improve Object Detection

arxiv: https://arxiv.org/abs/1610.05861

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

arxiv: https://arxiv.org/abs/1610.09609

Hierarchical Object Detection with Deep Reinforcement Learning

intro: Deep Reinforcement Learning Workshop (NIPS 2016)
project page: https://imatge-upc.github.io/detection-2016-nipsws/
arxiv: https://arxiv.org/abs/1611.03718
slides: http://www.slideshare.net/xavigiro/hierarchical-object-detection-with-deep-reinforcement-learning
github: https://github.com/imatge-upc/detection-2016-nipsws
blog: http://jorditorres.org/nips/

Learning to detect and localize many objects from few examples

arxiv: https://arxiv.org/abs/1611.05664

Speed/accuracy trade-offs for modern convolutional object detectors

intro: Google Research
arxiv: https://arxiv.org/abs/1611.10012

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

arxiv: https://arxiv.org/abs/1612.01051
github: https://github.com/BichenWuUCB/squeezeDet
github: https://github.com/fregu856/2D_detection

Feature Pyramid Network (FPN)

Feature Pyramid Networks for Object Detection

intro: Facebook AI Research
arxiv: https://arxiv.org/abs/1612.03144

Action-Driven Object Detection with Top-Down Visual Attentions

arxiv: https://arxiv.org/abs/1612.06704

Beyond Skip Connections: Top-Down Modulation for Object Detection

intro: CMU & UC Berkeley & Google Research
arxiv: https://arxiv.org/abs/1612.06851

Wide-Residual-Inception Networks for Real-time Object Detection

intro: Inha University
arxiv: https://arxiv.org/abs/1702.01243

Attentional Network for Visual Object Detection

intro: University of Maryland & Mitsubishi Electric Research Laboratories
arxiv: https://arxiv.org/abs/1702.01478

CC-Net

Learning Chained Deep Features and Classifiers for Cascade in Object Detection

intro: chained cascade network (CC-Net). 81.1% mAP on PASCAL VOC 2007
arxiv: https://arxiv.org/abs/1702.07054

DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling

https://arxiv.org/abs/1703.10295

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03944

Spatial Memory for Context Reasoning in Object Detection

arxiv: https://arxiv.org/abs/1704.04224

Accurate Single Stage Detector Using Recurrent Rolling Convolution

intro: CVPR 2017. SenseTime
keywords: Recurrent Rolling Convolution (RRC)
arxiv: https://arxiv.org/abs/1704.05776
github: https://github.com/xiaohaoChen/rrc_detection

Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

https://arxiv.org/abs/1704.05775

S-OHEM: Stratified Online Hard Example Mining for Object Detection

https://arxiv.org/abs/1705.02233

LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

intro: Embedded Vision Workshop in CVPR. UC San Diego & Qualcomm Inc
arxiv: https://arxiv.org/abs/1705.05922

Point Linking Network for Object Detection

intro: Point Linking Network (PLN)
arxiv: https://arxiv.org/abs/1706.03646

Perceptual Generative Adversarial Networks for Small Object Detection

https://arxiv.org/abs/1706.05274

Few-shot Object Detection

https://arxiv.org/abs/1706.08249

Yes-Net: An effective Detector Based on Global Information

https://arxiv.org/abs/1706.09180

SMC Faster R-CNN: Toward a scene-specialized multi-object detector

https://arxiv.org/abs/1706.10217

Towards lightweight convolutional neural networks for object detection

https://arxiv.org/abs/1707.01395

RON: Reverse Connection with Objectness Prior Networks for Object Detection

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1707.01691
github: https://github.com/taokong/RON

Residual Features and Unified Prediction Network for Single Stage Detection

https://arxiv.org/abs/1707.05031

Deformable Part-based Fully Convolutional Network for Object Detection

intro: BMVC 2017 (oral). Sorbonne Universités & CEDRIC
arxiv: https://arxiv.org/abs/1707.06175

Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1707.06399

Recurrent Scale Approximation for Object Detection in CNN

intro: ICCV 2017
keywords: Recurrent Scale Approximation (RSA)
arxiv: https://arxiv.org/abs/1707.09531
github: https://github.com/sciencefans/RSA-for-object-detection

DSOD

DSOD: Learning Deeply Supervised Object Detectors from Scratch

intro: ICCV 2017. Fudan University & Tsinghua University & Intel Labs China
arxiv: https://arxiv.org/abs/1708.01241
github: https://github.com/szq0214/DSOD

Focal Loss for Dense Object Detection

intro: Facebook AI Research
keywords: RetinaNet
arxiv: https://arxiv.org/abs/1708.02002
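
For orientation, the loss named in the title can be written FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below evaluates it for a single binary prediction. It is a minimal illustration, not the authors' implementation: the function name, the clamping constant eps, and the default values of gamma and alpha are assumptions chosen for the example.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one binary prediction.

    p: predicted probability of the positive class (0..1)
    y: ground-truth label, 1 for positive, 0 for negative
    gamma, alpha, eps: illustrative defaults (assumptions for this sketch)
    """
    # Probability and weight assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to keep log() finite at the extremes.
    p_t = min(max(p_t, eps), 1.0 - eps)
    # The (1 - p_t)^gamma factor shrinks the loss of confident, correct predictions.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1))  # easy positive: small loss
    print(focal_loss(0.1, 1))  # hard positive: much larger loss
```

With gamma set to 0 the modulating factor becomes 1 and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.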

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.02863

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection

https://arxiv.org/abs/1709.04347

StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

https://arxiv.org/abs/1709.05788

NMS

End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression

intro: CVPR 2015
arxiv: http://arxiv.org/abs/1411.5309
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wan_End-to-End_Integration_of_2015_CVPR_paper.pdf

A convnet for non-maximum suppression

arxiv: http://arxiv.org/abs/1511.06437

Improving Object Detection With One Line of Code

intro: University of Maryland
keywords: Soft-NMS
arxiv: https://arxiv.org/abs/1704.04503
github: https://github.com/bharatsingh430/soft-nms
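
As a rough picture of the Soft-NMS idea: classic non-maximum suppression discards every box that overlaps a higher-scoring box too strongly, whereas Soft-NMS keeps those boxes and lowers their scores as a function of the overlap. The sketch below uses a Gaussian decay, score * exp(-IoU^2 / sigma). It is an illustration written for this list, not the code from the repository above; the function names, the (x1, y1, x2, y2) box layout, and the values of sigma and score_threshold are assumptions.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (index, decayed_score) pairs in the order boxes are selected.

    sigma and score_threshold are illustrative values (assumptions).
    """
    scores = list(scores)  # work on a copy; scores are decayed in place
    candidates = list(range(len(boxes)))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best = max(candidates, key=lambda i: scores[i])
        candidates.remove(best)
        if scores[best] < score_threshold:
            break  # everything left scores even lower
        kept.append((best, scores[best]))
        # Decay, rather than delete, the boxes that overlap the selected one.
        for i in candidates:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (20, 20, 30, 30)]
    print(soft_nms(boxes, [0.9, 0.8, 0.7]))
```

In the small example, the duplicate of the top box is not removed; it is pushed below the non-overlapping box in the ranking.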

Learning non-maximum suppression

https://arxiv.org/abs/1705.02950

Weakly Supervised Object Detection

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

intro: CVPR 2016
arxiv: http://arxiv.org/abs/1604.05766

Weakly supervised object detection using pseudo-strong labels

arxiv: http://arxiv.org/abs/1607.04731

Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

intro: IJCAI 2017
arxiv: https://arxiv.org/abs/1706.06768

Detection From Video

Learning Object Class Detectors from Weakly Annotated Video

intro: CVPR 2012
paper: https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf

Analysing domain shift factors between videos and images for object detection

arxiv: https://arxiv.org/abs/1501.01186

Video Object Recognition

slides: http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video%20Object%20Recognition.pptx

Deep Learning for Saliency Prediction in Natural Video

intro: Submitted on 12 Jan 2016
keywords: Deep learning, saliency map, optical flow, convolution network, contrast features
paper: https://hal.archives-ouvertes.fr/hal-01251614/document

T-CNN

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

intro: Winning solution in ILSVRC2015 Object Detection from Video(VID) Task
arxiv: http://arxiv.org/abs/1604.02532
github: https://github.com/myfavouritekk/T-CNN

Object Detection from Video Tubelets with Convolutional Neural Networks

intro: CVPR 2016 Spotlight paper
arxiv: https://arxiv.org/abs/1604.04053
paper: http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf
github: https://github.com/myfavouritekk/vdetlib

Object Detection in Videos with Tubelets and Multi-context Cues

intro: SenseTime Group
slides: http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf
slides: http://image-net.org/challenges/talks/Object%20Detection%20in%20Videos%20with%20Tubelets%20and%20Multi-context%20Cues%20-%20Final.pdf

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

intro: BMVC 2016
keywords: pseudo-labeler
arxiv: http://arxiv.org/abs/1607.04648
paper: http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf

CNN Based Object Detection in Large Video Images

intro: WangTao @ iQIYI
keywords: object retrieval, object detection, scene classification
slides: http://on-demand.gputechconf.com/gtc/2016/presentation/s6362-wang-tao-cnn-based-object-detection-large-video-images.pdf

Object Detection in Videos with Tubelet Proposal Networks

arxiv: https://arxiv.org/abs/1702.06355

Flow-Guided Feature Aggregation for Video Object Detection

intro: MSRA
arxiv: https://arxiv.org/abs/1703.10025

Video Object Detection using Faster R-CNN

blog: http://andrewliao11.github.io/object_detection/faster_rcnn/
github: https://github.com/andrewliao11/py-faster-rcnn-imagenet

Improving Context Modeling for Video Object Detection and Tracking

http://image-net.org/challenges/talks_2017/ilsvrc2017_short(poster).pdf

Temporal Dynamic Graph LSTM for Action-driven Video Object Detection

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.00666

Object Detection in 3D

Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks

arxiv: https://arxiv.org/abs/1609.06666

Object Detection on RGB-D

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

arxiv: http://arxiv.org/abs/1407.5736

Differential Geometry Boosts Convolutional Neural Networks for Object Detection

intro: CVPR 2016
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016_workshops/w23/html/Wang_Differential_Geometry_Boosts_CVPR_2016_paper.html

A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation

https://arxiv.org/abs/1703.03347

Salient Object Detection

This task involves predicting the salient regions of an image given by human eye fixations.

Best Deep Saliency Detection Models (CVPR 2016 & 2015)

http://i.cs.hku.hk/~yzyu/vision.html

Large-scale optimization of hierarchical features for saliency prediction in natural images

paper: http://coxlab.org/pdfs/cvpr2014_vig_saliency.pdf

Predicting Eye Fixations using Convolutional Neural Networks

paper: http://www.escience.cn/system/file?fileId=72648

Saliency Detection by Multi-Context Deep Learning

paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhao_Saliency_Detection_by_2015_CVPR_paper.pdf

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

arxiv: http://arxiv.org/abs/1510.05484

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

paper: www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html

Shallow and Deep Convolutional Networks for Saliency Prediction

intro: CVPR 2016
arxiv: http://arxiv.org/abs/1603.00845
github: https://github.com/imatge-upc/saliency-2016-cvpr

Recurrent Attentional Networks for Saliency Detection

intro: CVPR 2016. recurrent attentional convolutional-deconvolution network (RACDNN)
arxiv: http://arxiv.org/abs/1604.03227

Two-Stream Convolutional Networks for Dynamic Saliency Prediction

arxiv: http://arxiv.org/abs/1607.04730

Unconstrained Salient Object Detection

Unconstrained Salient Object Detection via Proposal Subset Optimization

intro: CVPR 2016
project page: http://cs-people.bu.edu/jmzhang/sod.html
paper: http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf
github: https://github.com/jimmie33/SOD
caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection

DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection

paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DHSNet_Deep_Hierarchical_CVPR_2016_paper.pdf

Salient Object Subitizing

intro: CVPR 2015
intro: predicting the existence and the number of salient objects in an image using holistic cues
project page: http://cs-people.bu.edu/jmzhang/sos.html
arxiv: http://arxiv.org/abs/1607.07525
paper: http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf
caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing

Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection

intro: ACMMM 2016. deeply-supervised recurrent convolutional neural network (DSRCNN)
arxiv: http://arxiv.org/abs/1608.05177

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1608.05186

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

arxiv: http://arxiv.org/abs/1608.08029

A Deep Multi-Level Network for Saliency Prediction

arxiv: http://arxiv.org/abs/1609.01064

Visual Saliency Detection Based on Multiscale Deep CNN Features

intro: IEEE Transactions on Image Processing
arxiv: http://arxiv.org/abs/1609.02077

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

intro: DSCLRCN
arxiv: https://arxiv.org/abs/1610.01708

Deeply supervised salient object detection with short connections

arxiv: https://arxiv.org/abs/1611.04849

Weakly Supervised Top-down Salient Object Detection

intro: Nanyang Technological University
arxiv: https://arxiv.org/abs/1611.05345

SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

project page: https://imatge-upc.github.io/saliency-salgan-2017/
arxiv: https://arxiv.org/abs/1701.01081

Visual Saliency Prediction Using a Mixture of Deep Neural Networks

arxiv: https://arxiv.org/abs/1702.00372

A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network

arxiv: https://arxiv.org/abs/1702.00615

Saliency Detection by Forward and Backward Cues in Deep-CNNs

https://arxiv.org/abs/1703.00152

Supervised Adversarial Networks for Image Saliency Detection

https://arxiv.org/abs/1704.07242

Group-wise Deep Co-saliency Detection

https://arxiv.org/abs/1707.07381

Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection

intro: University of Maryland College Park & eBay Inc
arxiv: https://arxiv.org/abs/1708.00079

Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.02001

Learning Uncertain Convolutional Features for Accurate Saliency Detection

intro: Accepted as a poster in ICCV 2017
arxiv: https://arxiv.org/abs/1708.02031

Deep Edge-Aware Saliency Detection

https://arxiv.org/abs/1708.04366

Self-explanatory Deep Salient Object Detection

intro: National University of Defense Technology, China & National University of Singapore
arxiv: https://arxiv.org/abs/1708.05595

PiCANet: Learning Pixel-wise Contextual Attention in ConvNets and Its Application in Saliency Detection

https://arxiv.org/abs/1708.06433

DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets

https://arxiv.org/abs/1709.02495

Saliency Detection in Video

Deep Learning For Video Saliency Detection

arxiv: https://arxiv.org/abs/1702.00871

Video Salient Object Detection Using Spatiotemporal Deep Features

https://arxiv.org/abs/1708.01447

Predicting Video Saliency with Object-to-Motion CNN and Two-layer Convolutional LSTM

https://arxiv.org/abs/1709.06316

Visual Relationship Detection

Visual Relationship Detection with Language Priors

intro: ECCV 2016 oral
paper: https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf
github: https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection

ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection

intro: Visual Phrase reasoning Convolutional Neural Network (ViP-CNN), Visual Phrase Reasoning Structure (VPRS)
arxiv: https://arxiv.org/abs/1702.07191

Visual Translation Embedding Network for Visual Relation Detection

arxiv: https://www.arxiv.org/abs/1702.08319

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

intro: CVPR 2017 spotlight paper
arxiv: https://arxiv.org/abs/1703.03054

Detecting Visual Relationships with Deep Relational Networks

intro: CVPR 2017 oral. The Chinese University of Hong Kong
arxiv: https://arxiv.org/abs/1704.03114

Identifying Spatial Relations in Images using Convolutional Neural Networks

https://arxiv.org/abs/1706.04215

PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.01956

Specific Object Detection

Deep Deformation Network for Object Landmark Localization

arxiv: http://arxiv.org/abs/1605.01014

Fashion Landmark Detection in the Wild

intro: ECCV 2016
project page: http://personal.ie.cuhk.edu.hk/~lz013/projects/FashionLandmarks.html
arxiv: http://arxiv.org/abs/1608.03049
github(Caffe): https://github.com/liuziwei7/fashion-landmarks

Deep Learning for Fast and Accurate Fashion Item Detection

intro: Kuznech Inc.
intro: MultiBox and Fast R-CNN
paper: https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf

OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)

github: https://github.com/geometalab/OSMDeepOD

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

intro: IEEE SITIS 2016
arxiv: https://arxiv.org/abs/1611.04357

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

arxiv: https://arxiv.org/abs/1611.05424

Deep Cuboid Detection: Beyond 2D Bounding Boxes

intro: CMU & Magic Leap
arxiv: https://arxiv.org/abs/1611.10010

Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

arxiv: https://arxiv.org/abs/1612.03019

Deep Learning Logo Detection with Data Expansion by Synthesising Context

arxiv: https://arxiv.org/abs/1612.09322

Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks

arxiv: https://arxiv.org/abs/1702.00307

Automatic Handgun Detection Alarm in Videos Using Deep Learning

arxiv: https://arxiv.org/abs/1702.05147
results: https://github.com/SihamTabik/Pistol-Detection-in-Videos

Using Deep Networks for Drone Detection

intro: AVSS 2017
arxiv: https://arxiv.org/abs/1706.05726

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.01642

DeepVoting: An Explainable Framework for Semantic Part Detection under Partial Occlusion

https://arxiv.org/abs/1709.04577

Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

https://arxiv.org/abs/1709.09283

Face Detection

Multi-view Face Detection Using Deep Convolutional Neural Networks

intro: Yahoo
arxiv: http://arxiv.org/abs/1502.02766
github: https://github.com/guoyilin/FaceDetection_CNN

From Facial Parts Responses to Face Detection: A Deep Learning Approach

intro: ICCV 2015. CUHK
project page: http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html
arxiv: https://arxiv.org/abs/1509.06451
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Yang_From_Facial_Parts_ICCV_2015_paper.pdf

Compact Convolutional Neural Network Cascade for Face Detection

arxiv: http://arxiv.org/abs/1508.01292
github: https://github.com/Bkmz21/FD-Evaluation
github: https://github.com/Bkmz21/CompactCNNCascade

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1606.00850
github(MXNet): https://github.com/tfwu/FaceDetection-ConvNet-3D

CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection

intro: CMU
arxiv: https://arxiv.org/abs/1606.05413

Finding Tiny Faces

intro: CVPR 2017. CMU
project page: http://www.cs.cmu.edu/~peiyunh/tiny/index.html
arxiv: https://arxiv.org/abs/1612.04402
github: https://github.com/peiyunh/tiny
github(inference-only): https://github.com/chinakook/hr101_mxnet

Towards a Deep Learning Framework for Unconstrained Face Detection

intro: overlap with CMS-RCNN
arxiv: https://arxiv.org/abs/1612.05322

Supervised Transformer Network for Efficient Face Detection

arxiv: http://arxiv.org/abs/1607.05477

UnitBox

UnitBox: An Advanced Object Detection Network

intro: ACM MM 2016
arxiv: http://arxiv.org/abs/1608.01471

Bootstrapping Face Detection with Hard Negative Examples

author: Shaohua Wan (万韶华) @ Xiaomi
intro: Faster R-CNN, hard negative mining. state-of-the-art on the FDDB dataset
arxiv: http://arxiv.org/abs/1608.02236

Grid Loss: Detecting Occluded Faces

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1609.00129
paper: http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf
poster: http://www.eccv2016.org/files/posters/P-2A-34.pdf

A Multi-Scale Cascade Fully Convolutional Network Face Detector

intro: ICPR 2016
arxiv: http://arxiv.org/abs/1609.03536

MTCNN

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

project page: https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html
arxiv: https://arxiv.org/abs/1604.02878
github(Matlab): https://github.com/kpzhang93/MTCNN_face_detection_alignment
github: https://github.com/pangyupo/mxnet_mtcnn_face_detection
github: https://github.com/DaFuCoding/MTCNN_Caffe
github(MXNet): https://github.com/Seanlinx/mtcnn
github: https://github.com/Pi-DeepLearning/RaspberryPi-FaceDetection-MTCNN-Caffe-With-Motion
github(Caffe): https://github.com/foreverYoungGitHub/MTCNN
github: https://github.com/CongWeilin/mtcnn-caffe
github: https://github.com/AlphaQi/MTCNN-light

Face Detection using Deep Learning: An Improved Faster RCNN Approach

intro: DeepIR Inc
arxiv: https://arxiv.org/abs/1701.08289

Faceness-Net: Face Detection through Deep Facial Part Responses

intro: An extended version of ICCV 2015 paper
arxiv: https://arxiv.org/abs/1701.08393

Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”

intro: CVPR 2017. MP-RCNN, MP-RPN
arxiv: https://arxiv.org/abs/1703.09145

End-To-End Face Detection and Recognition

https://arxiv.org/abs/1703.10818

Face R-CNN

https://arxiv.org/abs/1706.01061

Face Detection through Scale-Friendly Deep Convolutional Networks

https://arxiv.org/abs/1706.02863

Scale-Aware Face Detection

intro: CVPR 2017. SenseTime & Tsinghua University
arxiv: https://arxiv.org/abs/1706.09876

Multi-Branch Fully Convolutional Network for Face Detection

https://arxiv.org/abs/1707.06330

SSH: Single Stage Headless Face Detector

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.03979

Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container

https://arxiv.org/abs/1708.04370

FaceBoxes: A CPU Real-time Face Detector with High Accuracy

intro: IJCB 2017
keywords: Rapidly Digested Convolutional Layers (RDCL), Multiple Scale Convolutional Layers (MSCL)
intro: the proposed detector runs at 20 FPS on a single CPU core and 125 FPS using a GPU for VGA-resolution images
arxiv: https://arxiv.org/abs/1708.05234

S3FD: Single Shot Scale-invariant Face Detector

intro: ICCV 2017
intro: can run at 36 FPS on a Nvidia Titan X (Pascal) for VGA-resolution images
arxiv: https://arxiv.org/abs/1708.05237

Detecting Faces Using Region-based Fully Convolutional Networks

https://arxiv.org/abs/1709.05256

AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection

https://arxiv.org/abs/1709.07326

Facial Point / Landmark Detection

Deep Convolutional Network Cascade for Facial Point Detection

homepage: http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm
paper: http://www.ee.cuhk.edu.hk/~xgwang/papers/sunWTcvpr13.pdf
github: https://github.com/luoyetx/deep-landmark

Facial Landmark Detection by Deep Multi-task Learning

intro: ECCV 2014
project page: http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html
paper: http://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2014_deepfacealign.pdf
github(Matlab): https://github.com/zhzhanp/TCDCN-face-alignment

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1608.05477

Detecting facial landmarks in the video based on a hybrid framework

arxiv: http://arxiv.org/abs/1609.06441

Deep Constrained Local Models for Facial Landmark Detection

arxiv: https://arxiv.org/abs/1611.08657

Effective face landmark localization via single deep network

arxiv: https://arxiv.org/abs/1702.02719

A Convolution Tree with Deconvolution Branches: Exploiting Geometric Relationships for Single Shot Keypoint Detection

https://arxiv.org/abs/1704.01880

Deep Alignment Network: A convolutional neural network for robust face alignment

intro: CVPRW 2017
arxiv: https://arxiv.org/abs/1706.01789
github: https://github.com/MarekKowalski/DeepAlignmentNetwork

Joint Multi-view Face Alignment in the Wild

https://arxiv.org/abs/1708.06023

FacePoseNet: Making a Case for Landmark-Free Face Alignment

https://arxiv.org/abs/1708.07517

People Detection

End-to-end people detection in crowded scenes

arxiv: http://arxiv.org/abs/1506.04878
github: https://github.com/Russell91/reinspect
ipn: http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb
youtube: https://www.youtube.com/watch?v=QeWl0h3kQ24

Detecting People in Artwork with CNNs

intro: ECCV 2016 Workshops
arxiv: https://arxiv.org/abs/1610.08871

Deep Multi-camera People Detection

arxiv: https://arxiv.org/abs/1702.04593

Person Head Detection

Context-aware CNNs for person head detection

intro: ICCV 2015
project page: http://www.di.ens.fr/willow/research/headdetection/
arxiv: http://arxiv.org/abs/1511.07917
github: https://github.com/aosokin/cnn_head_detection

Pedestrian Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

intro: CVPR 2015
project page: http://mmlab.ie.cuhk.edu.hk/projects/TA-CNN/
arxiv: http://arxiv.org/abs/1412.0069

Deep Learning Strong Parts for Pedestrian Detection

intro: ICCV 2015. CUHK. DeepParts
intro: Achieving 11.89% average miss rate on Caltech Pedestrian Dataset
paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf

Taking a Deeper Look at Pedestrians

intro: CVPR 2015
arxiv: https://arxiv.org/abs/1501.05790

Convolutional Channel Features

intro: ICCV 2015
arxiv: https://arxiv.org/abs/1504.07339
github: https://github.com/byangderek/CCF

Learning Complexity-Aware Cascades for Deep Pedestrian Detection

intro: ICCV 2015
arxiv: https://arxiv.org/abs/1507.05348

Deep convolutional neural networks for pedestrian detection

arxiv: http://arxiv.org/abs/1510.03608
github: https://github.com/DenisTome/DeepPed

Scale-aware Fast R-CNN for Pedestrian Detection

arxiv: https://arxiv.org/abs/1510.08160

New algorithm improves speed and accuracy of pedestrian detection

blog: http://www.eurekalert.org/pub_releases/2016-02/uoc--nai020516.php

Pushing the Limits of Deep CNNs for Pedestrian Detection

intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
arxiv: http://arxiv.org/abs/1603.04525

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

arxiv: http://arxiv.org/abs/1607.04436

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

arxiv: http://arxiv.org/abs/1607.04441

Is Faster R-CNN Doing Well for Pedestrian Detection?

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1607.07032
github: https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian

Reduced Memory Region Based Deep Convolutional Neural Network Detection

intro: IEEE 2016 ICCE-Berlin
arxiv: http://arxiv.org/abs/1609.02500

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

arxiv: https://arxiv.org/abs/1610.03466

Multispectral Deep Neural Networks for Pedestrian Detection

intro: BMVC 2016 oral
arxiv: https://arxiv.org/abs/1611.02644

Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters

intro: CVPR 2017
project page: http://ml.cs.tsinghua.edu.cn:5000/publications/synunity/
arxiv: https://arxiv.org/abs/1703.06283
github(Tensorflow): https://github.com/huangshiyu13/RPNplus

Illuminating Pedestrians via Simultaneous Detection & Segmentation

https://arxiv.org/abs/1706.08564

Rotational Rectification Network for Robust Pedestrian Detection

intro: CMU & Volvo Construction
arxiv: https://arxiv.org/abs/1706.08917

STD-PD: Generating Synthetic Training Data for Pedestrian Detection in Unannotated Videos

intro: The University of North Carolina at Chapel Hill
arxiv: https://arxiv.org/abs/1707.09100

Too Far to See? Not Really! — Pedestrian Detection with Scale-aware Localization Policy

https://arxiv.org/abs/1709.00235

Vehicle Detection

DAVE: A Unified Framework for Fast Vehicle Detection and Annotation

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1607.04564

Evolving Boxes for fast Vehicle Detection

arxiv: https://arxiv.org/abs/1702.00254

Fine-Grained Car Detection for Visual Census Estimation

intro: AAAI 2016
arxiv: https://arxiv.org/abs/1709.02480

Traffic-Sign Detection

Traffic-Sign Detection and Classification in the Wild

project page(code+dataset): http://cg.cs.tsinghua.edu.cn/traffic-sign/
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf
code & model: http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip

Detecting Small Signs from Large Images

intro: IEEE Conference on Information Reuse and Integration (IRI) 2017 oral
arxiv: https://arxiv.org/abs/1706.08574

Boundary / Edge / Contour Detection

Holistically-Nested Edge Detection

intro: ICCV 2015, Marr Prize
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf
arxiv: http://arxiv.org/abs/1504.06375
github: https://github.com/s9xie/hed

Unsupervised Learning of Edges

intro: CVPR 2016. Facebook AI Research
arxiv: http://arxiv.org/abs/1511.04166
zh-blog: http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html

Pushing the Boundaries of Boundary Detection using Deep Learning

arxiv: http://arxiv.org/abs/1511.07386

Convolutional Oriented Boundaries

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1608.02755

Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks

project page: http://www.vision.ee.ethz.ch/~cvlsegmentation/
arxiv: https://arxiv.org/abs/1701.04658
github: https://github.com/kmaninis/COB

Richer Convolutional Features for Edge Detection

intro: CVPR 2017
keywords: richer convolutional features (RCF)
arxiv: https://arxiv.org/abs/1612.02103
github: https://github.com/yun-liu/rcf

Contour Detection from Deep Patch-level Boundary Prediction

https://arxiv.org/abs/1705.03159

CASENet: Deep Category-Aware Semantic Edge Detection

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1705.09759

Skeleton Detection

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

arxiv: http://arxiv.org/abs/1603.09446
github: https://github.com/zeakey/DeepSkeleton

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

arxiv: http://arxiv.org/abs/1609.03659

SRN: Side-output Residual Network for Object Symmetry Detection in the Wild

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1703.02243
github: https://github.com/KevinKecc/SRN

Fruit Detection

Deep Fruit Detection in Orchards

arxiv: https://arxiv.org/abs/1610.03677

Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards

intro: The Journal of Field Robotics in May 2016
project page: http://confluence.acfr.usyd.edu.au/display/AGPub/
arxiv: https://arxiv.org/abs/1610.08120

Part Detection

Objects as context for part detection

https://arxiv.org/abs/1703.09529

Object Proposal

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

arxiv: http://arxiv.org/abs/1510.04445
github: https://github.com/aghodrati/deepproposal

Scale-aware Pixel-wise Object Proposal Networks

intro: IEEE Transactions on Image Processing
arxiv: http://arxiv.org/abs/1601.04798

Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization

intro: BMVC 2016. AttractioNet
arxiv: https://arxiv.org/abs/1606.04446
github: https://github.com/gidariss/AttractioNet

Learning to Segment Object Proposals via Recursive Neural Networks

arxiv: https://arxiv.org/abs/1612.01057

Learning Detection with Diverse Proposals

intro: CVPR 2017
keywords: differentiable Determinantal Point Process (DPP) layer, Learning Detection with Diverse Proposals (LDDP)
arxiv: https://arxiv.org/abs/1704.03533

ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond

keywords: product detection
arxiv: https://arxiv.org/abs/1704.06752

Improving Small Object Proposals for Company Logo Detection

intro: ICMR 2017
arxiv: https://arxiv.org/abs/1704.08881

Localization

Beyond Bounding Boxes: Precise Localization of Objects in Images

intro: PhD Thesis
homepage: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html
phd-thesis: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf
github(“SDS using hypercolumns”): https://github.com/bharath272/sds

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

arxiv: http://arxiv.org/abs/1503.00949

Weakly Supervised Object Localization Using Size Estimates

arxiv: http://arxiv.org/abs/1608.04314

Active Object Localization with Deep Reinforcement Learning

intro: ICCV 2015
keywords: Markov Decision Process
arxiv: https://arxiv.org/abs/1511.06015

Localizing objects using referring expressions

intro: ECCV 2016
keywords: LSTM, multiple instance learning (MIL)
paper: http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf
github: https://github.com/varun-nagaraja/referring-expressions

LocNet: Improving Localization Accuracy for Object Detection

intro: CVPR 2016 oral
arxiv: http://arxiv.org/abs/1511.07763
github: https://github.com/gidariss/LocNet

Learning Deep Features for Discriminative Localization

homepage: http://cnnlocalization.csail.mit.edu/
arxiv: http://arxiv.org/abs/1512.04150
github(Tensorflow): https://github.com/jazzsaxmafia/Weakly_detector
github: https://github.com/metalbubble/CAM
github: https://github.com/tdeboissiere/VGG16CAM-keras

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

intro: ECCV 2016
project page: http://www.di.ens.fr/willow/research/contextlocnet/
arxiv: http://arxiv.org/abs/1609.04331
github: https://github.com/vadimkantorov/contextlocnet

Ensemble of Part Detectors for Simultaneous Classification and Localization

https://arxiv.org/abs/1705.10034

STNet: Selective Tuning of Convolutional Networks for Object Localization

https://arxiv.org/abs/1708.06418

Soft Proposal Networks for Weakly Supervised Object Localization

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1709.01829

Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN

intro: ACM MM 2017
arxiv: https://arxiv.org/abs/1709.08295

Tutorials / Talks

Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection

slides: http://research.microsoft.com/en-us/um/people/kahe/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf

Towards Good Practices for Recognition & Detection

intro: Hikvision Research Institute. Supervised Data Augmentation (SDA)
slides: http://image-net.org/challenges/talks/2016/Hikvision_at_ImageNet_2016.pdf

Projects

TensorBox: a simple framework for training neural networks to detect objects in images

intro: “The basic model implements the simple and robust GoogLeNet-OverFeat algorithm. We additionally provide an implementation of the ReInspect algorithm”
github: https://github.com/Russell91/TensorBox

Object detection in torch: Implementation of some object detection frameworks in torch

github: https://github.com/fmassa/object-detection.torch

Using DIGITS to train an Object Detection network

github: https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md

FCN-MultiBox Detector

intro: Fully convolutional MultiBox detector (like SSD) implemented in Torch.
github: https://github.com/teaonly/FMD.torch

KittiBox: A car detection model implemented in Tensorflow.

keywords: MultiNet
intro: KittiBox is a collection of scripts to train our model FastBox on the Kitti Object Detection Dataset
github: https://github.com/MarvinTeichmann/KittiBox

Deformable Convolutional Networks + MST + Soft-NMS

github: https://github.com/bharatsingh430/Deformable-ConvNets

Tools

BeaverDam: Video annotation tool for deep learning training labels

https://github.com/antingshen/BeaverDam

Blogs

Convolutional Neural Networks for Object Detection

http://rnd.azoft.com/convolutional-neural-networks-object-detection/

Introducing automatic object detection to visual search (Pinterest)

keywords: Faster R-CNN
blog: https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search
demo: https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4
review: https://news.developer.nvidia.com/pinterest-introduces-the-future-of-visual-search/?mkt_tok=eyJpIjoiTnpaa01UWXpPRE0xTURFMiIsInQiOiJJRjcybjkwTmtmallORUhLOFFFODBDclFqUlB3SWlRVXJXb1MrQ013TDRIMGxLQWlBczFIeWg0TFRUdnN2UHY2ZWFiXC9QQVwvQzBHM3B0UzBZblpOSmUyU1FcLzNPWXI4cml2VERwTTJsOFwvOEk9In0%3D

Deep Learning for Object Detection with DIGITS

blog: https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/

Analyzing The Papers Behind Facebook’s Computer Vision Approach

keywords: DeepMask, SharpMask, MultiPathNet
blog: https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/

Easily Create High Quality Object Detectors with Deep Learning

intro: dlib v19.2
blog: http://blog.dlib.net/2016/10/easily-create-high-quality-object.html

How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit

blog: https://blogs.technet.microsoft.com/machinelearning/2016/10/25/how-to-train-a-deep-learned-object-detection-model-in-cntk/
github: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN

Object Detection in Satellite Imagery, a Low Overhead Approach

part 1: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9
part 2: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64

You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks

part 1: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of
part 2: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t

Faster R-CNN Pedestrian and Car Detection

blog: https://bigsnarf.wordpress.com/2016/11/07/faster-r-cnn-pedestrian-and-car-detection/
ipn: https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb
github: https://github.com/bigsnarfdude/Faster-RCNN_TF

Small U-Net for vehicle detection

blog: https://medium.com/@vivek.yadav/small-u-net-for-vehicle-detection-9eec216f9fd6#.md4u80kad

Region of interest pooling explained

blog: https://deepsense.io/region-of-interest-pooling-explained/
github: https://github.com/deepsense-io/roi-pooling

Supercharge your Computer Vision models with the TensorFlow Object Detection API

blog: https://research.googleblog.com/2017/06/supercharge-your-computer-vision-models.html
github: https://github.com/tensorflow/models/tree/master/object_detection

Reposted from: https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html

你可能感兴趣的:(深度学习)

机器学习与深度学习间关系与区别 ℒℴѵℯ心·动ꦿ໊ོ꫞ 人工智能学习深度学习 python
一、机器学习概述定义机器学习（MachineLearning,ML）是一种通过数据驱动的方法，利用统计学和计算算法来训练模型，使计算机能够从数据中学习并自动进行预测或决策。机器学习通过分析大量数据样本，识别其中的模式和规律，从而对新的数据进行判断。其核心在于通过训练过程，让模型不断优化和提升其预测准确性。主要类型1.监督学习（SupervisedLearning）监督学习是指在训练数据集中包含输入
将cmd中命令输出保存为txt文本文件落难Coder Windows cmd window
最近深度学习本地的训练中我们常常要在命令行中运行自己的代码，无可厚非，我们有必要保存我们的炼丹结果，但是复制命令行输出到txt是非常麻烦的，其实Windows下的命令行为我们提供了相应的操作。其基本的调用格式就是：运行指令>输出到的文件名称或者具体保存路径测试下，我打开cmd并且ping一下百度：pingwww.baidu.com>./data.txt看下相同目录下data.txt的输出：如果你再
推荐3家毕业AI论文可五分钟一键生成！文末附免费教程！小猪包333 写论文人工智能 AI写作深度学习计算机视觉
在当前的学术研究和写作领域，AI论文生成器已经成为许多研究人员和学生的重要工具。这些工具不仅能够帮助用户快速生成高质量的论文内容，还能进行内容优化、查重和排版等操作。以下是三款值得推荐的AI论文生成器：千笔-AIPassPaper、懒人论文以及AIPaperPass。千笔-AIPassPaper千笔-AIPassPaper是一款基于深度学习和自然语言处理技术的AI写作助手，旨在帮助用户快速生成高质
AI大模型的架构演进与最新发展季风泯灭的季节 AI大模型应用技术二人工智能架构
随着深度学习的发展，AI大模型（LargeLanguageModels,LLMs）在自然语言处理、计算机视觉等领域取得了革命性的进展。本文将详细探讨AI大模型的架构演进，包括从Transformer的提出到GPT、BERT、T5等模型的历史演变，并探讨这些模型的技术细节及其在现代人工智能中的核心作用。一、基础模型介绍：Transformer的核心原理Transformer架构的背景在Transfo
[实践应用] 深度学习之模型性能评估指标 YuanDaima2048 深度学习工具使用深度学习人工智能损失函数性能评估 pytorch python 机器学习
文章总览：YuanDaiMa2048博客文章总览深度学习之模型性能评估指标分类任务回归任务排序任务聚类任务生成任务其他介绍在机器学习和深度学习领域，评估模型性能是一项至关重要的任务。不同的学习任务需要不同的性能指标来衡量模型的有效性。以下是对一些常见任务及其相应的性能评估指标的详细解释和总结。分类任务分类任务是指模型需要将输入数据分配到预定义的类别或标签中。以下是分类任务中常用的性能指标：准确率(
[实践应用] 深度学习之优化器 YuanDaima2048 深度学习工具使用 pytorch 深度学习人工智能机器学习 python 优化器
文章总览：YuanDaiMa2048博客文章总览深度学习之优化器1.随机梯度下降（SGD）2.动量优化（Momentum）3.自适应梯度（Adagrad）4.自适应矩估计（Adam）5.RMSprop总结其他介绍在深度学习中，优化器用于更新模型的参数，以最小化损失函数。常见的优化函数有很多种，下面是几种主流的优化器及其特点、原理和PyTorch实现：1.随机梯度下降（SGD）原理:随机梯度下降通过
生成式地图制图 Bwywb_3 深度学习机器学习深度学习生成对抗网络
生成式地图制图（GenerativeCartography）是一种利用生成式算法和人工智能技术自动创建地图的技术。它结合了传统的地理信息系统（GIS）技术与现代生成模型（如深度学习、GANs等），能够根据输入的数据自动生成符合需求的地图。这种方法在城市规划、虚拟环境设计、游戏开发等多个领域具有应用前景。主要特点：自动化生成：通过算法和模型，系统能够根据输入的地理或空间数据自动生成地图，而无需人工逐
吴恩达深度学习笔记(30)-正则化的解释极客Array
正则化（Regularization）深度学习可能存在过拟合问题——高方差，有两个解决方法，一个是正则化，另一个是准备更多的数据，这是非常可靠的方法，但你可能无法时时刻刻准备足够多的训练数据或者获取更多数据的成本很高，但正则化通常有助于避免过拟合或减少你的网络误差。如果你怀疑神经网络过度拟合了数据，即存在高方差问题，那么最先想到的方法可能是正则化，另一个解决高方差的方法就是准备更多数据，这也是非常
个人学习笔记7-6：动手学深度学习pytorch版-李沐浪子L 深度学习深度学习笔记计算机视觉 python 人工智能神经网络 pytorch
#人工智能##深度学习##语义分割##计算机视觉##神经网络#计算机视觉13.11全卷积网络全卷积网络（fullyconvolutionalnetwork，FCN）采用卷积神经网络实现了从图像像素到像素类别的变换。引入l转置卷积（transposedconvolution）实现的，输出的类别预测与输入图像在像素级别上具有一一对应关系：通道维的输出即该位置对应像素的类别预测。13.11.1构造模型下
深度学习-点击率预估-研究论文2024-09-14速读 sp_fyf_2024 深度学习人工智能
深度学习-点击率预估-研究论文2024-09-14速读1.DeepTargetSessionInterestNetworkforClick-ThroughRatePredictionHZhong,JMa,XDuan,SGu,JYao-2024InternationalJointConferenceonNeuralNetworks,2024深度目标会话兴趣网络用于点击率预测摘要：这篇文章提出了一种新
损失函数与反向传播 Star_. PyTorch pytorch 深度学习 python
损失函数定义与作用损失函数(lossfunction)在深度学习领域是用来计算搭建模型预测的输出值和真实值之间的误差。1.损失函数越小越好2.计算实际输出与目标之间的差距3.为更新输出提供依据（反向传播)常见的损失函数回归常见的损失函数有：均方差（MeanSquaredError，MSE）、平均绝对误差（MeanAbsoluteErrorLoss，MAE）、HuberLoss是一种将MSE与MAE
【深度学习】训练过程中一个OOM的问题，太难查了 weixin_40293999 深度学习深度学习人工智能
现象：各位大佬又遇到过ubuntu的这个问题么？现象是在训练过程中，ssh上不去了，能ping通，没死机，但是ubunutu的pc侧的显示器，鼠标啥都不好用了。只能重启。问题原因：OOM了95G，尼玛！！！！pytorch爆内存了，然后journald假死了，在journald被watchdog干掉之后，系统就崩溃了。这种规模的爆内存一般，即使被oomkill了，也要卡半天的，确实会这样，能不能配
云服务业界动态简报-20180128 Captain7
一、青云青云QingCloud推出深度学习平台DeepLearningonQingCloud，包含了主流的深度学习框架及数据科学工具包，通过QingCloudAppCenter一键部署交付，可以让算法工程师和数据科学家快速构建深度学习开发环境，将更多的精力放在模型和算法调优。二、腾讯云1.腾讯云正式发布腾讯专有云TCE(TencentCloudEnterprise)矩阵，涵盖企业版、大数据版、AI
机器学习VS深度学习 nfgo 机器学习
机器学习（MachineLearning,ML）和深度学习（DeepLearning,DL）是人工智能（AI）的两个子领域，它们有许多相似之处，但在技术实现和应用范围上也有显著区别。下面从几个方面对两者进行区分：1.概念层面机器学习：是让计算机通过算法从数据中自动学习和改进的技术。它依赖于手动设计的特征和数学模型来进行学习，常用的模型有决策树、支持向量机、线性回归等。深度学习：是机器学习的一个子领
大数据毕业设计hadoop+spark+hive知识图谱租房数据分析可视化大屏租房推荐系统 58同城租房爬虫房源推荐系统房价预测系统计算机毕业设计机器学习深度学习人工智能 2401_84572577 程序员大数据 hadoop 人工智能
做了那么多年开发，自学了很多门编程语言，我很明白学习资源对于学一门新语言的重要性，这些年也收藏了不少的Python干货，对我来说这些东西确实已经用不到了，但对于准备自学Python的人来说，或许它就是一个宝藏，可以给你省去很多的时间和精力。别在网上瞎学了，我最近也做了一些资源的更新，只要你是我的粉丝，这期福利你都可拿走。我先来介绍一下这些东西怎么用，文末抱走。（1）Python所有方向的学习路线（
深度学习-13-小语言模型之SmolLM的使用皮皮冰燃深度学习深度学习
文章附录1SmolLM概述1.1SmolLM简介1.2下载模型2运行2.1在CPU/GPU/多GPU上运行模型2.2使用torch.bfloat162.3通过位和字节的量化版本3应用示例4问题及解决4.1attention_mask和pad_token_id报错4.2max_new_tokens=205参考附录1SmolLM概述1.1SmolLM简介SmolLM是一系列尖端小型语言模型，提供三种规
基于深度学习的农作物病害检测 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的农作物病害检测利用卷积神经网络（CNN）、生成对抗网络（GAN）、Transformer等深度学习技术，自动识别和分类农作物的病害，帮助农业工作者提高作物管理效率、减少损失。1.农作物病害检测的挑战病害种类繁多：农作物病害的类型多样，不同病害在同一作物上的表现差异很大，同时同一种病害在不同生长阶段的症状也可能不同。环境影响：天气、光照、湿度等外部环境因素会影响农作物的表现，使得病害检
基于深度学习的文本引导的图像编辑 SEU-WYL 深度学习dnn 深度学习人工智能