Faster R-CNN Reading Notes

Background

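Proposals are the test-time computational bottleneck in state-of-the-art detection systems.
Fast R-CNN runs at nearly real-time speed at test time once the time spent on proposals is ignored, so proposal extraction becomes the time bottleneck of the detection system.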

Main contributions

... computing proposals with a deep convolutional neural network ... leads to an elegant and effective solution where proposal computation is nearly cost-free given the detection network's computation. We introduce novel Region Proposal Networks (RPNs) that share convolutional layers with state-of-the-art object detection networks.
The paper proposes a Region Proposal Network built on a convolutional neural network. Because it shares its convolutional layers with the detection network, the extra cost it adds is very small.

Faster R-CNN architecture

RPN+Fast R-CNN
The entire system is a single, unified network for object detection.
The whole system is a single, unified detection model.
RPN
A Region Proposal Network (RPN) takes an image (of any size) as input and outputs a set of rectangular object proposals, each with an objectness score.
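The RPN accepts an input of any size and outputs candidate boxes, each with a confidence that it contains an object.
An important concept in the paper is the anchor, that is, a reference box. Put simply, an input image passes through several convolutional layers to produce a feature map (of size W×H in the paper's notation). 9 anchors are defined at every location of the feature map, and mapping them back onto the input image by the network's downsampling ratio gives the positions of the reference boxes.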
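
The anchor layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: the stride of 16, the three scales (128, 256, 512) and the three aspect ratios (1:2, 1:1, 2:1) are the defaults reported in the paper, while the function name `generate_anchors` and the choice to centre each anchor on the middle of its feature-map cell are hypothetical conventions picked for this example.

```python
from itertools import product


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) boxes in input-image coordinates.

    Every feature-map location gets len(scales) * len(ratios) anchors
    (9 with the defaults), all sharing the same centre.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell mapped back to the input image
        # (cell-centre convention is an assumption of this sketch).
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio is height / width; the box area stays at scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # 2 * 3 locations * 9 anchors = 54
```

Anchors near the image border extend outside the image, which is why some coordinates come out negative.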

  • Loss Function
    We assign a positive label to two kinds of anchors: (i) the anchor/anchors with the highest Intersection-over-Union (IoU) overlap with a ground-truth box, or (ii) an anchor that has an IoU overlap higher than 0.7 with any ground-truth box.
    We assign a negative label to a non-positive anchor if its IoU ratio is lower than 0.3 for all ground-truth boxes.
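    Positive: (1) the anchor with the highest IoU with a ground-truth box, or (2) an anchor whose IoU with any ground-truth box is above 0.7
    Negative: an anchor whose IoU with every ground-truth box is below 0.3
    All other anchors are discarded and do not contribute to the loss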
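
The labeling rules above can be written out as a small sketch. It assumes boxes are `(x1, y1, x2, y2)` tuples and encodes labels as 1 (positive), 0 (negative) and -1 (discarded); the names `iou` and `label_anchors`, the label encoding, and the guard that skips ground-truth boxes no anchor overlaps are choices made for this example, not something the paper specifies.

```python
def iou(a, b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 positive, 0 negative, -1 discarded."""
    overlaps = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = [-1] * len(anchors)
    for i, row in enumerate(overlaps):
        best = max(row, default=0.0)
        if best > pos_thresh:
            labels[i] = 1  # rule (ii): IoU above 0.7 with some ground truth
        elif best < neg_thresh:
            labels[i] = 0  # below 0.3 for every ground-truth box
    # Rule (i): the best-matching anchor(s) of each ground-truth box are
    # positive even when their IoU is at or below pos_thresh.
    for j in range(len(gt_boxes)):
        column = [row[j] for row in overlaps]
        top = max(column, default=0.0)
        if top > 0:  # assumption: ignore ground truths with no overlap at all
            for i, value in enumerate(column):
                if value == top:
                    labels[i] = 1
    return labels


gt = [(0, 0, 10, 10)]
print(label_anchors([(0, 0, 10, 5), (20, 20, 30, 30)], gt))  # [1, 0]
```

In the usage line the first anchor only reaches an IoU of 0.5, yet it is still positive because it is the best match for the ground-truth box, which is the case rule (i) exists for.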

  • Training RPNs
    It is possible to optimize for the loss functions of all anchors, but this will bias towards negative samples as they are dominate. Instead, we randomly sample 256 anchors in an image to compute the loss function of a mini-batch, where the sampled positive and negative anchors have a ratio of up to 1:1. If there are fewer than 128 positive samples in an image, we pad the mini-batch with negative ones.
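    For each image, 256 anchors are randomly sampled to compute the loss, with a positive-to-negative ratio of up to 1:1. If an image has fewer than 128 positive anchors, the rest of the mini-batch is filled with negatives.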
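
A minimal sketch of this sampling step, assuming each anchor already carries a label of 1 (positive), 0 (negative) or -1 (discarded). The function name `sample_minibatch`, the label encoding and the optional `rng` argument are hypothetical and only serve the example.

```python
import random


def sample_minibatch(labels, batch_size=256, max_pos_fraction=0.5, rng=None):
    """Pick anchor indices for one RPN mini-batch.

    At most batch_size * max_pos_fraction positives are drawn (128 with the
    defaults); negatives fill whatever is left of the batch.
    """
    rng = rng or random.Random()
    pos = [i for i, label in enumerate(labels) if label == 1]
    neg = [i for i, label in enumerate(labels) if label == 0]
    num_pos = min(len(pos), int(batch_size * max_pos_fraction))
    num_neg = min(len(neg), batch_size - num_pos)
    return rng.sample(pos, num_pos), rng.sample(neg, num_neg)


# 10 positives, 1000 negatives, 50 discarded anchors.
labels = [1] * 10 + [0] * 1000 + [-1] * 50
pos_idx, neg_idx = sample_minibatch(labels, rng=random.Random(0))
print(len(pos_idx), len(neg_idx))  # 10 246
```

With only 10 positives available, the remaining 246 slots go to negatives, so the batch still holds 256 anchors.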

  • Sharing Features for RPN and Fast R-CNN
