3000-Class Object Detection -- R-FCN-3000 at 30fps: Decoupling Detection and Classification

R-FCN-3000 at 30fps: Decoupling Detection and Classification
Code will be made available

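This paper addresses how to detect 3000 object classes in real time. The main idea is to decouple object detection from object classification.
The proposed R-FCN-3000 outperforms YOLO9000 by 18% and runs at 30 frames per second.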
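Real-time detection is already fairly mature for a few dozen classes, but real-world applications involve thousands of object categories. The recently proposed fully convolutional class of detectors computes an objectness score for every class on a given image and reaches high accuracy within a limited computational budget. Fully-convolutional representations provide an efficient approach to tasks such as object detection, instance segmentation, tracking, and relationship detection, but they require class-specific sets of filters for each class in order to learn the information relevant to that class.
For example, R-FCN / Deformable-R-FCN requires 49/197 position-specific filters for each class.
Retina-Net requires 9 filters for each class for each convolutional feature map.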

The key to R-FCN-3000 is decoupling objectness detection from classification, so that adding classes does not increase the computation needed for localization. In the paper's words:
The key insight behind the proposed R-FCN-3000 architecture is to decouple objectness detection and classification of the detected object so that the computational requirements for localization remain constant as the number of classes increases.
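To make the scaling argument concrete, here is a minimal sketch of the arithmetic. It uses only the per-class filter counts quoted above (49 for R-FCN, 197 for Deformable-R-FCN). The function names are hypothetical, the background class is ignored, and the decoupled count assumes the localization branch keeps the same per-class filter budget but applies it to K super-classes instead of C fine-grained classes.

```python
# Per-class position-specific filter counts quoted in the text above.
FILTERS_PER_CLASS = {"R-FCN": 49, "Deformable-R-FCN": 197}


def coupled_filter_count(detector: str, num_classes: int) -> int:
    """Filters needed when every class has its own position-specific set."""
    return FILTERS_PER_CLASS[detector] * num_classes


def decoupled_filter_count(detector: str, num_super_classes: int) -> int:
    """Filters needed when only K super-classes get position-specific sets.

    Assumption (not from the source): the same per-class filter budget is
    spent on K super-classes, independent of the number of fine classes.
    """
    return FILTERS_PER_CLASS[detector] * num_super_classes


if __name__ == "__main__":
    # Coupled cost grows linearly with the class count; decoupled cost
    # depends only on K (here K = 1, i.e. a pure objectness detector).
    for num_classes in (20, 80, 3000):
        print(
            num_classes,
            coupled_filter_count("R-FCN", num_classes),
            decoupled_filter_count("R-FCN", 1),
        )
```

Under these assumptions, a coupled R-FCN with 3000 classes would need 49 x 3000 = 147,000 position-specific filters, whereas the decoupled count stays fixed once K is chosen.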

4.1. Weakly Supervised vs. Supervised?
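Weakly supervised methods perform worse than supervised ones, so we stay with supervised training here. The images we use from the ImageNet database are labeled, and each image contains only 1-2 objects.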

We show that careful design choices with respect to the CNN architecture, loss function and training protocol can yield a large-scale detector trained on the ImageNet classification set with significantly better accuracy compared to weakly supervised detectors.

The main idea of R-FCN-3000 is shown in the following diagram.
[Image 2 of the original post: R-FCN-3000 architecture diagram]

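The diagram shows two branches. The upper branch decides whether an object is present at all: it extracts valid candidate regions without regard to the specific object category. This is the super-class detector.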

Super-class Discovery
Here we first extract a 2048-dimensional feature vector from the final layer of ResNet-101 to represent each class. For C classes this gives C 2048-dimensional feature vectors. Applying K-means clustering to these C vectors yields K super-class clusters. When K is 1, the super-class detector predicts objectness.
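The clustering step can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the per-class feature vectors are random stand-ins for the real ResNet-101 features, the sizes are hypothetical, and `kmeans` is a plain Lloyd's-algorithm implementation written here with NumPy rather than any particular library routine.

```python
import numpy as np


def kmeans(features: np.ndarray, k: int, iters: int = 50, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's K-means; returns one cluster index per row of `features`."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows (fancy indexing copies them).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    labels = np.zeros(len(features), dtype=int)
    sq_norms = (features ** 2).sum(axis=1)
    for _ in range(iters):
        # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2.
        dists = (
            sq_norms[:, None]
            - 2.0 * features @ centroids.T
            + (centroids ** 2).sum(axis=1)[None, :]
        )
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


if __name__ == "__main__":
    # Hypothetical sizes: C = 300 classes, 2048-d features, K = 10 super-classes.
    num_classes, dim, num_super_classes = 300, 2048, 10
    class_features = np.random.default_rng(0).normal(size=(num_classes, dim))
    super_class_of = kmeans(class_features, num_super_classes)
    # Each fine-grained class index now maps to one super-class index.
    print(super_class_of[:10])
```

With `k=1` every class falls into the same cluster, which matches the remark above that the super-class detector then reduces to an objectness detector.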


