Double Anchor R-CNN for Human Detection in a Crowd

title	Double Anchor R-CNN for Human Detection in a Crowd
url	https://arxiv.org/pdf/1909.09998.pdf
Motivation	Reduce the false positives and false negatives that arise when detecting humans in a crowd.
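Content
Double Anchor:
1. Double Anchor detects the head and the body at the same time; each person's head and body are naturally coupled.
2. A crossover strategy generates high-quality proposals for both head and body. Head and body features are fused to improve the predictions. A Joint NMS algorithm suppresses false positives and improves robustness.
3. Head features help distinguish instances: a detection with no head detection is treated as a false positive. NMS easily causes false negatives, which can be reduced by using the overlap of heads.
Difficulties of crowd occlusion:
1. In crowd scenes, scales, aspect ratios, and poses vary widely, so the detector has to be robust.
2. When people overlap heavily, the semantic features of different instances are intertwined and their boundaries are hard to tell apart (several instances may be detected as one, for example two instances mistaken for a single one).
3. Even when the instances are distinguished correctly, NMS can still produce false negatives because the overlap is large; raising the NMS threshold produces false positives instead.
Double Anchor RPN:
1. The same anchor regresses both head and body. The head-body branch takes the head anchor as its reference and the body-head branch takes the body anchor as its reference; each branch has only one RPN classification, which predicts foreground or background.
2. Loss: (equation not preserved)
3. An anchor is a positive sample if its overlap with the ground truth of the principal part exceeds the threshold (0.7).
Proposal Crossover:
1. Problem: the attached-part proposals are poor, because anchors and labels are generated from the principal part.
2. Selecting as positives only the anchors that satisfy the IoU threshold for both parts would address problem 1, but it leaves too few positives, which hurts the results and makes them sensitive to noise.
3. Proposal Crossover: use complementary information to generate positive samples, i.e. the body-head branch serves as an augmentation of the head-body branch.
(1) In each branch, a pair is considered positive if its principal part overlaps the ground truth by more than 0.5 (the attached part is still inaccurate at this point).
(2) Exchange proposals between the head-body branch and the body-head branch: compute the overlap between the attached part of the head-body branch and the principal part of the body-head branch. If it exceeds the threshold (0.5), the body proposal of the head-body branch is replaced by the body proposal of the body-head branch that has the largest overlap.
(3) The crossover method produces high-quality proposals for R-CNN.
Feature Aggregation:
1. Fuse body and head features: head features help distinguish instances in a crowd, and the semantic information in body features provides effective context that helps head prediction.
2. Aggregation options: combine the spatial feature maps, or combine the fully-connected (FC) features. FC is chosen in the end to avoid head-body misalignment.
3. Classification needs global information and localization needs local resolution, so classification uses the aggregated FC vector while regression is done on each part's own features.
Joint NMS:
1. The joint score is more reliable: it considers both the body score and the head score, and in a false positive the head score is low.
2. The original NMS considers only one branch and does not suppress false positives from the other branch. Joint NMS suppresses false positives from both branches at once.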
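Step (2) of Proposal Crossover can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: boxes are plain `(x1, y1, x2, y2)` tuples, and the names `iou` and `crossover` are hypothetical. It covers only the direction spelled out in the note, where the body proposal of a head-body pair is replaced by the best-overlapping body proposal from the body-head branch.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def crossover(head_body_pairs, body_head_pairs, thresh=0.5):
    """Replace weak attached-part (body) proposals of the head-body branch.

    head_body_pairs: list of (head_box, body_box); the head is the principal part.
    body_head_pairs: list of (body_box, head_box); the body is the principal part.
    Returns a new list of (head_box, body_box) pairs.
    """
    result = []
    for head, body in head_body_pairs:
        best_overlap, best_body = 0.0, None
        # Find the body-head branch body proposal that overlaps this body most.
        for candidate_body, _ in body_head_pairs:
            overlap = iou(body, candidate_body)
            if overlap > best_overlap:
                best_overlap, best_body = overlap, candidate_body
        if best_body is not None and best_overlap > thresh:
            # The principal-part body from the other branch is assumed more accurate.
            result.append((head, best_body))
        else:
            # No sufficiently overlapping candidate: keep the original pair.
            result.append((head, body))
    return result


if __name__ == "__main__":
    hb = [((10, 10, 20, 20), (0, 0, 40, 100))]
    bh = [((2, 0, 42, 100), (11, 10, 21, 20))]
    print(crossover(hb, bh))
```

The 0.5 default mirrors the threshold quoted in the note; a production version would vectorize the overlap computation over all proposals rather than loop in Python.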
Experiments
Evaluation Metric:
1. Standard log-average miss rate (MR): the MR is computed over false positives per image (FPPI) in the range [10^-2, 10^0] (MR^-2).
2. AP50.
Implementation Details:
Baseline: FPN with a ResNet-50 model pre-trained on ImageNet.
Detection Results on CrowdHuman:
Overall Performance: (results not preserved)
Ablation Study on Proposal Crossover: About 40 positive pairs per image are obtained on average if proposals are sampled by requiring an IoU threshold of 0.5 for both the body and head parts. After the crossover strategy, the average number of positive proposal pairs increases to 97 per image, which shows that more qualified proposals benefit detection performance.
Ablation Study on Feature Aggregation: (results not preserved)
Ablation Study on Joint NMS: NMS threshold 0.5.
Results on COCOPersons and CrowdPose: (results not preserved)
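For the MR^-2 metric listed under Evaluation Metric, the following is a minimal sketch of a log-average miss rate computation. It assumes the common pedestrian-detection convention of sampling 9 reference points evenly spaced in log space over [10^-2, 10^0] and taking their geometric mean; the number of points and the function name `log_average_miss_rate` are assumptions, not details given in the note.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate sampled at FPPI values in [1e-2, 1e0].

    fppi, miss_rate: equal-length sequences describing the curve, ordered so
    that fppi is non-decreasing (i.e. decreasing score threshold).
    """
    # Reference FPPI values, evenly spaced in log space from 1e-2 to 1e0.
    refs = [10 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)]
    sampled = []
    for ref in refs:
        # If the curve never reaches an FPPI this low, count every object as missed.
        mr = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                mr = m  # fppi is non-decreasing, so the last match wins
        # Clamp to avoid log(0) for a perfect detector.
        sampled.append(max(mr, 1e-10))
    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))


if __name__ == "__main__":
    print(log_average_miss_rate([0.001, 0.5, 2.0], [0.6, 0.4, 0.2]))
```

Lower values are better. Because the average is taken in log space, improvements at low FPPI weigh as much as improvements at high FPPI.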
Thoughts
