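The Most Detailed Explanation of the SSD Object Detection Algorithm
Object Detection in Deep Learning: Understanding SSD and Analyzing Its Details
The SSD Algorithm: A Detailed Look at the Model Structure, with Python Source Code Analysis
A Detailed Reading of the SSD Framework (Part 1)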
For the SSD implementation, I referred to https://github.com/balancap/SDC-Vehicle-Detection, which describes the implementation details:
The SSD network uses the concept of anchor boxes for object detection. The image below illustrates the concept: at several scales, boxes with different sizes and aspect ratios are pre-defined. The goal of the SSD convolutional network is, for each of these anchor boxes, to detect whether an object lies inside (or close to) the box, and to compute the offset between the object's bounding box and the fixed anchor box.
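The offset between a ground-truth box and an anchor box can be sketched as follows. This is a minimal illustration that assumes the center-size parameterization from the original SSD paper; the function names `encode_offsets` and `decode_offsets` are hypothetical, and real implementations often also divide the offsets by scaling constants, which is omitted here.

```python
import math

# Boxes are (cx, cy, w, h) tuples in normalized image coordinates.
# Hypothetical helper names; the parameterization follows the SSD paper.

def encode_offsets(gt_box, anchor):
    """Return the regression target (t_cx, t_cy, t_w, t_h) for one anchor."""
    g_cx, g_cy, g_w, g_h = gt_box
    a_cx, a_cy, a_w, a_h = anchor
    return (
        (g_cx - a_cx) / a_w,   # center shift, relative to anchor width
        (g_cy - a_cy) / a_h,   # center shift, relative to anchor height
        math.log(g_w / a_w),   # log-scale width ratio
        math.log(g_h / a_h),   # log-scale height ratio
    )

def decode_offsets(offsets, anchor):
    """Invert encode_offsets: recover the box from predicted offsets."""
    t_cx, t_cy, t_w, t_h = offsets
    a_cx, a_cy, a_w, a_h = anchor
    return (
        a_cx + t_cx * a_w,
        a_cy + t_cy * a_h,
        a_w * math.exp(t_w),
        a_h * math.exp(t_h),
    )

anchor = (0.5, 0.5, 0.2, 0.2)
gt_box = (0.55, 0.48, 0.3, 0.25)
targets = encode_offsets(gt_box, anchor)
recovered = decode_offsets(targets, anchor)
```

An anchor that coincides exactly with the ground-truth box gets an all-zero target, which is why the network only has to learn small corrections for well-matched anchors.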
In the case of the SSD network, we use VGG16 as the base architecture: it provides high-quality features at different scales, which are then used as inputs for the multibox modules in charge of computing the object type and coordinates for each anchor box. The architecture of the network we use is illustrated in the following TensorBoard graph. It follows the original SSD paper:
For instance, consider the 8x8 feature block described in the image above. At every coordinate in the grid, it defines 4 anchor boxes of different dimensions. The multibox module taking this feature Tensor as input will thus provide two output Tensors: a classification Tensor of shape 8x8x4xNClasses and an offset Tensor of shape 8x8x4x4, where in the latter, the last dimension stands for the 4 coordinates of every bounding box.
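The two output shapes for this 8x8 example can be written out in a few lines. This is only a shape calculation, not a network; the helper name `multibox_output_shapes` is hypothetical, and `n_classes = 21` is an assumption (20 Pascal VOC classes plus background), since the text leaves NClasses unspecified.

```python
def multibox_output_shapes(height, width, anchors_per_cell, n_classes):
    """Shapes of the classification and offset tensors for one feature map."""
    cls_shape = (height, width, anchors_per_cell, n_classes)
    loc_shape = (height, width, anchors_per_cell, 4)  # 4 box coordinates
    return cls_shape, loc_shape

n_classes = 21  # assumption: Pascal VOC classes + background
cls_shape, loc_shape = multibox_output_shapes(8, 8, 4, n_classes)

# Number of anchor boxes contributed by this single feature map.
n_anchors = 8 * 8 * 4
print(cls_shape, loc_shape, n_anchors)
```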
As a result, the global SSD network will provide a classification score and an offset for a total of 8732 anchor boxes. During training, we therefore try to minimize both errors: the classification error on every anchor box, and the localization error whenever there is a positive match with a ground-truth bounding box. We refer to the original SSD paper for the precise equations defining the loss function.
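The figure of 8732 can be checked by summing the anchors over all feature maps. The feature-map sizes and anchors per cell below are not given in the text above; they are the SSD300 configuration from the original SSD paper and are stated here as an assumption.

```python
# (feature map side length, anchors per cell), assuming the SSD300
# configuration from the original SSD paper.
SSD300_FEATURE_MAPS = [
    (38, 4),
    (19, 6),
    (10, 6),
    (5, 6),
    (3, 4),
    (1, 4),
]

def total_anchors(feature_maps):
    """Sum side * side * anchors_per_cell over all square feature maps."""
    return sum(side * side * k for side, k in feature_maps)

per_map = [side * side * k for side, k in SSD300_FEATURE_MAPS]
total = total_anchors(SSD300_FEATURE_MAPS)
print(per_map, total)  # [5776, 2166, 600, 150, 36, 4] 8732
```

Most anchors come from the largest (38x38) map, which is the one responsible for small objects.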
For the full model diagram, see:
https://github.com/bjzhao143/objectiveDetect/blob/master/models/ssd300_model.png
I downloaded the code and pretrained model from https://github.com/HiKapok/SSD.TensorFlow and used them for fine-grained recognition.
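First, a word on the structure shown in the figure. A, B, C, D, and E are the different network configurations that the VGG team tested and compared for accuracy. They found that LRN (local response normalization) did not seem to help, so it was dropped from the later configurations. D and E in the figure are VGG16 and VGG19, respectively.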
Where does the "19" in VGG19 come from?
In the model file, VGG19 counts activation layers as layers too, so it lists 43 layers:
# 43 layers
VGG19_LAYERS = (
    'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1',
    'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',
    'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'conv3_3', 'relu3_3', 'conv3_4', 'relu3_4', 'pool3',
    'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2', 'conv4_3', 'relu4_3', 'conv4_4', 'relu4_4', 'pool4',
    'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2', 'conv5_3', 'relu5_3', 'conv5_4', 'relu5_4', 'pool5',
    'fc6', 'relu6',
    'fc7', 'relu7',
    'fc8', 'softmax',
)
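A quick way to reconcile the 43 entries with the name "VGG19" is to count only the layers that carry trainable weights (convolutional and fully connected). The sketch below rebuilds the same layer names from the number of convolutions per block; the helper `build_vgg19_layers` is hypothetical and exists only for this illustration.

```python
# Number of conv layers in each of the five VGG19 blocks.
CONVS_PER_BLOCK = (2, 2, 4, 4, 4)

def build_vgg19_layers():
    """Rebuild the 43 layer names: conv/relu pairs, pools, then fc layers."""
    layers = []
    for block, n_convs in enumerate(CONVS_PER_BLOCK, start=1):
        for i in range(1, n_convs + 1):
            layers += [f"conv{block}_{i}", f"relu{block}_{i}"]
        layers.append(f"pool{block}")
    layers += ["fc6", "relu6", "fc7", "relu7", "fc8", "softmax"]
    return tuple(layers)

layers = build_vgg19_layers()
conv_layers = [name for name in layers if name.startswith("conv")]
fc_layers = [name for name in layers if name.startswith("fc")]

print(len(layers))                         # 43 entries in total
print(len(conv_layers), len(fc_layers))    # 16 conv + 3 fc
print(len(conv_layers) + len(fc_layers))   # 19 weight layers
```

Under this reading, the ReLU, pooling, and softmax entries account for the remaining 24 names and are not counted in the "19".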