multi-scale net

1. A Real-Time Face Detector Based on an End-to-End CNN_2017

多尺度的方法能够解决小尺度的问题

In this paper, a real-time approach for face detection was proposed by utilizing a single end-to-end deep neural network withmulti-scale feature maps, multi-scale prior aspect ratios as well as confidence rectification.Multi-scale feature maps overcome the difficulties of detecting small face, and meanwhile, multiscale prior aspect ratios reduce the computing cost and the confidence rectification

2. A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection_2016

网络包括一个目标提取网络和一个检测网络
In the proposal sub-network, detection is performed at multiple output layers, so that receptive fields match objects of different scales. These complementary scale-specific detectors are combined to produce a strong multi-scale object detector.
浅的卷基层有小的感知域能够检测小目标，在目标检测中如何覆盖多种目标的尺度是一个很关键的问题。检测器通常是一个点输出，这个点输出通过学习图像的一个区域或者一个模板得到，那么这个模板要和目标空间对应。两种对应方法：

The first is to learn a single classifier and rescale the image multiple times, so that the classifier can match all possible object sizes.this strategy requires feature computation at multiple image scales. While it usually produces the most accurate detection, it tends to be very costly.在图像的不同尺度下计算特征然后组合起来，计算量大

multiple classifiers to a single input image avoids the repeated computation of feature maps and tends to be efficient. However, it requires an individual classifier for each object scale and usually fails to produce good detectors.

RPN网络的学习类似于第二种先通过一个尺度的图像输入，然后得到很多不同的模板(f),本文提出了(g)

本文方法

网络结构理解为将不同卷积层的特征图参与到Loss的训练，每一层的特征图计算一个loss在总的loss中给每一层的loss一个权重。输出维度为c+b相当于anchor的数量为1，anchor可以根据每一层的尺度进行设计

2.png

虽然经过上述过程可以实现检测但是结果不好，将分支的特征图直接送入ROIpooling之前进行解卷积能够提高速度调整图像分辨率。

检测

论文解读http://www.xzhewei.com/Note-%E7%AC%94%E8%AE%B0/Pedestrian-Detection/Note-A-unified-multi-scale-deep-convolutional-neural-network-for-fast-object-detection/

3. Pedestrian Detection by Using CNN Features with Skip Connection

用于行人检测的层间连接，在VGG16上训练每层通过RPN网络得到128个框，将每层进行ROIpooling，得到13*13的featuremap为了得到更多的环境信息然后通过卷积进行stride= 2，送入全连接。这样做不会增大很多计算量吗？

层间

不用l2而是用BN因为：1更快2更好传播3更准

4. MFR-CNN: Incorporating Multi-Scale Features and Global Information for Traffic Object Detection

The proposed model consists of two components: a region proposal network that generates candidate object regions and an object detection network that incorporates multi-scale features and global information, namely MFR-CNN.

结构

5. Salient object detection via multi-scale attention CNN_2018

Moreover, the information loss of downsampling operations of FCN-based models results in the loss of details of the final saliency map, such as edges of the saliency object. In this paper, we proposed a novel deep convolutional neural network (CNN) by introducing a spatial and channel-wise attention layer into a multi-scale encoder-decoder framework.