【WiderPerson】《WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild》


TMM-2019


Contents

  • 1 Background and Motivation
  • 2 Related Work
  • 3 Advantages / Contributions
  • 4 WiderPerson Dataset
    • A. Data Collection
    • B. Annotation Tool
    • C. Image Annotation
    • D. Dataset Statistics
    • E. Benchmarking
  • 5 Provided Baseline Method
    • A. Improved Faster R-CNN
    • B. Vanilla RetinaNet
  • 6 Experiments
    • A. Model Analysis
    • B. Dataset Analysis
    • C. Generalization Capability
  • 7 Conclusion (own)


1 Background and Motivation

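Pedestrian detection has a wide range of applications, such as security and surveillance, mobile robotics, autonomous driving, and crowd sourcing.

Detection performance depends on both the detection algorithm and the data; progress in one drives progress in the other, in an alternating upward spiral.

There is a gap in diversity and density between real-world requirements and current pedestrian detection benchmarks.

The authors propose a large and diverse dataset named WiderPerson for dense pedestrian detection in the wild, with 29.87 annotations per image on average.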

Comparison with several existing pedestrian detection datasets:
[Image: dataset comparison]

2 Related Work

  • Dataset
  • Method
    • Generic Object Detection
    • Pedestrian Detection

3 Advantages / Contributions

  • Propose the WiderPerson pedestrian dataset
  • Improve the Faster R-CNN detector to deal with large density and diversity variations
  • Demonstrate the generalization capability of the dataset (used for pre-training)

4 WiderPerson Dataset

A. Data Collection

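Images are collected from Google, Bing, and Baidu.

More than 50 keywords are used (e.g., pedestrian, cyclist, walking, running, marathon, square dance and group photo).

About 50,000 candidate images are collected; 13,382 remain after filtering.

The training, validation and test sets contain 8,000, 1,000 and 4,382 images respectively.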

B. Annotation Tool

[Image: the annotation tool]

C. Image Annotation

Two points are annotated, the top of the head and the midpoint between the feet; the bounding box label is then generated from them with a fixed aspect ratio $\frac{w}{h} = 0.41$.
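The two-point annotation can be turned into a box with a few lines of arithmetic. A minimal sketch, assuming the box is centred horizontally on the midpoint of the two clicks (the notes do not say how the horizontal centre is chosen); the function name `box_from_head_and_feet` is made up for illustration:

```python
def box_from_head_and_feet(head_top, feet_mid, aspect_ratio=0.41):
    """Return (x1, y1, x2, y2) for a person annotated by two (x, y) points."""
    (hx, hy), (fx, fy) = head_top, feet_mid
    # The vertical distance between the two clicks gives the box height.
    height = abs(fy - hy)
    # The width follows from the fixed ratio w / h = 0.41.
    width = aspect_ratio * height
    # Assumption: centre the box on the mean x of the two clicks.
    cx = (hx + fx) / 2.0
    top = min(hy, fy)
    return (cx - width / 2.0, top, cx + width / 2.0, top + height)


box = box_from_head_and_feet((100, 50), (100, 250))
print(box)
```

For a person 200 pixels tall this yields a box roughly 82 pixels wide.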

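Fake humans are annotated as well, e.g. humans on posters, reflections, mannequins and statues.

After annotation, a three-fold cross-validation is used to check the annotations strictly: three people review the result, and it is sent back for rework if more than half of them find flaws.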

D. Dataset Statistics

1) Capacity
Three levels of difficulty are defined by pedestrian height:

  • ‘Easy’ (≥ 100 pixels),
  • ‘Medium’ (≥ 50 pixels),
  • ‘Hard’ (≥ 20 pixels)
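Because each level is a "≥" threshold, the subsets are nested: a tall pedestrian counts in all three. A small sketch of that reading, assuming the thresholds apply to box height in pixels; `difficulty_subsets` is a hypothetical helper, not something from the paper:

```python
# Minimum box height in pixels for each subset, as listed above.
DIFFICULTY_MIN_HEIGHT = {"Easy": 100, "Medium": 50, "Hard": 20}


def difficulty_subsets(box_height):
    """Return the names of every subset whose height threshold the box meets."""
    return [name for name, min_h in DIFFICULTY_MIN_HEIGHT.items()
            if box_height >= min_h]


print(difficulty_subsets(120))  # meets all three thresholds
print(difficulty_subsets(60))   # too small for 'Easy'
print(difficulty_subsets(10))   # below every threshold
```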

The recall of proposals generated by EdgeBox is measured to show how the difficulty levels differ from each other.
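Proposal recall itself is simple to compute once the proposals exist. The sketch below assumes boxes are `(x1, y1, x2, y2)` tuples and that a ground-truth box counts as recalled when some proposal reaches IoU 0.5 with it; that threshold is an assumption, and generating the EdgeBox proposals is out of scope here.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def proposal_recall(gt_boxes, proposals, iou_thresh=0.5):
    """Fraction of ground-truth boxes covered by at least one proposal."""
    if not gt_boxes:
        return 0.0
    hits = sum(1 for gt in gt_boxes
               if any(iou(gt, p) >= iou_thresh for p in proposals))
    return hits / len(gt_boxes)


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
props = [(0, 0, 10, 9)]  # covers only the first ground-truth box
print(proposal_recall(gts, props))
```

Running this once per difficulty subset gives one recall figure per level, which is the comparison the notes describe.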

2) Scale
[Image: histogram of bounding-box heights; x-axis: box height in pixels, y-axis: frequency]

3) Density
[Image: pedestrian density statistics]
4) Diversity

The images are diverse in country, city and season.
People appear at diverse positions within the image.

The pedestrian annotations are more fine-grained.
5) Occlusion
[Image: occlusion statistics]

E. Benchmarking

Evaluation metric: MR (miss rate); lower is better.

MR is the log-average miss rate over false positives per image (FPPI) in the range $[10^{-2}, 10^{0}]$.

Riders, partially-visible persons, crowd and ignore regions are ignored in the evaluation.
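A log-average miss rate can be computed from a miss-rate/FPPI curve roughly as follows. This is a sketch under assumptions: it samples 9 reference FPPI values evenly spaced in log space (the common Caltech-style convention, not confirmed by these notes) and takes the geometric mean of the miss rates there. The function name and the fallback of 1.0 when the curve has no point at or below a reference value are illustrative choices.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate at reference FPPI values in [1e-2, 1e0]."""
    curve = sorted(zip(fppi, miss_rate))
    # Reference points evenly spaced in log space between 10^-2 and 10^0.
    refs = [10 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)]
    log_sum = 0.0
    for ref in refs:
        # Use the curve point with the largest FPPI not exceeding the reference;
        # if there is none, count it as a miss rate of 1.0 (assumption).
        mr_at_ref = 1.0
        for f, m in curve:
            if f <= ref:
                mr_at_ref = m
            else:
                break
        # Clamp to avoid log(0) for a perfect detector.
        log_sum += math.log(max(mr_at_ref, 1e-10))
    return math.exp(log_sum / num_points)


print(log_average_miss_rate([0.001, 0.1, 1.0], [0.5, 0.5, 0.5]))
```

Averaging in log space means a detector is rewarded for low miss rates across the whole FPPI range rather than at a single operating point.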

5 Provided Baseline Method

A. Improved Faster R-CNN

The anchors use 11 different scales and a single aspect ratio (w/h = 0.41).
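With a single aspect ratio, each anchor is fully determined by its height. The notes do not list the 11 scales, so the heights below (geometric spacing from 20 to 640 pixels) are purely hypothetical placeholders; only the count of 11 and the ratio 0.41 come from the text.

```python
def geometric_heights(min_h=20.0, max_h=640.0, count=11):
    """Hypothetical anchor heights, evenly spaced in log space."""
    step = (max_h / min_h) ** (1.0 / (count - 1))
    return [min_h * step ** i for i in range(count)]


def make_anchor_shapes(heights, aspect_ratio=0.41):
    """Return (width, height) pairs that all share w / h = aspect_ratio."""
    return [(aspect_ratio * h, h) for h in heights]


anchors = make_anchor_shapes(geometric_heights())
for w, h in anchors:
    print(f"{w:7.1f} x {h:7.1f}")
```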

1) Finer Feature Map

The fourth down-sampling operation is removed, so the stride of the output feature map changes from 16 to 8.

All layers before the fourth down-sampling operation are unchanged, and all convolutional filters after it are modified by the "hole algorithm" (i.e. dilated, or atrous, convolution).
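The bookkeeping behind this change is easy to check by hand. The sketch below assumes a backbone with four stride-2 down-sampling stages (which gives the stride of 16 mentioned above); `plan_strides` is a made-up helper that reports the output stride and the dilation each stage's filters would need after one down-sampling step is dropped.

```python
def plan_strides(stage_strides, removed_stage=None):
    """Return (output_stride, dilation_per_stage).

    stage_strides: down-sampling factor at the start of each stage.
    removed_stage: 1-based index of the down-sampling step to remove, or None.
    """
    output_stride = 1
    dilation = 1
    dilations = []
    for index, stride in enumerate(stage_strides, start=1):
        if index == removed_stage:
            # Skip this down-sampling; filters from here on are dilated instead
            # so that they keep the same receptive field.
            dilation *= stride
        else:
            output_stride *= stride
        dilations.append(dilation)
    return output_stride, dilations


print(plan_strides([2, 2, 2, 2]))                   # original backbone
print(plan_strides([2, 2, 2, 2], removed_stage=4))  # finer feature map
```

Dropping the fourth step halves the stride, and only the filters after it pick up a dilation of 2, which matches the description that earlier layers are unchanged.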

2) Ignore Region and Tiny Pedestrian Handling

  • Ignore regions might contain objects of a given class without precise localization.
  • Pedestrians whose height is less than 20 pixels after scaling are filtered out online during training.
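Both rules can be sketched as small filters over `(x1, y1, x2, y2)` boxes. The 20-pixel limit comes from the notes; everything else is an assumption for illustration, in particular the helper names and the rule that a candidate background box is skipped when at least half of its area lies inside an ignore region (the notes only say, in the experiments section, that background boxes are not sampled from ignored areas).

```python
def filter_tiny(boxes, scale, min_height=20.0):
    """Keep boxes at least min_height pixels tall after resizing by `scale`."""
    return [b for b in boxes if (b[3] - b[1]) * scale >= min_height]


def overlap_fraction(box, region):
    """Fraction of the box area that lies inside the region."""
    iw = max(0.0, min(box[2], region[2]) - max(box[0], region[0]))
    ih = max(0.0, min(box[3], region[3]) - max(box[1], region[1]))
    area = (box[2] - box[0]) * (box[3] - box[1])
    return iw * ih / area if area > 0 else 0.0


def drop_negatives_in_ignore(candidates, ignore_regions, max_overlap=0.5):
    """Remove candidate background boxes that mostly fall in ignore regions."""
    # max_overlap = 0.5 is a hypothetical threshold, not taken from the paper.
    return [c for c in candidates
            if all(overlap_fraction(c, r) < max_overlap for r in ignore_regions)]


print(filter_tiny([(0, 0, 10, 30), (0, 0, 10, 50)], scale=0.5))
print(drop_negatives_in_ignore([(0, 0, 10, 10), (100, 100, 110, 110)],
                               [(0, 0, 20, 20)]))
```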

3) RoI Feature Enhancing

An SE (Squeeze-and-Excitation) channel-attention block is added.
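For reference, the core of an SE block is: average each channel (squeeze), pass the channel vector through two small fully connected layers ending in a sigmoid (excitation), and rescale the channels by the resulting gates. The pure-Python sketch below works on plain lists, omits biases, and uses hypothetical weights `w1` and `w2`; how the authors wire the block into the RoI head is not shown in these notes.

```python
import math


def se_block(features, w1, w2):
    """Apply squeeze-and-excitation to `features`.

    features: list of C channels, each a flat list of spatial values.
    w1: hidden x C weight rows, w2: C x hidden weight rows (biases omitted).
    """
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(ch) / len(ch) for ch in features]
    # Excitation: FC -> ReLU -> FC -> sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every channel by its gate in (0, 1).
    return [[g * v for v in ch] for g, ch in zip(gates, features)]


feats = [[1.0, 3.0], [2.0, 2.0]]  # 2 channels, 2 spatial positions each
w1 = [[0.5, 0.5]]                 # hypothetical weights: 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]              # hypothetical weights: 1 hidden unit -> 2 gates
print(se_block(feats, w1, w2))
```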

4) Dynamic Sample Strategy

In Faster R-CNN, 256 and 128 samples are drawn for RPN and Fast R-CNN, with positive-to-negative ratios of 1:1 and 1:3 respectively.

The dataset has about 28.87 objects per image on average, so the fixed sample strategy will lead to inadequate use of training positive samples.

The authors' improvement (dynamic sample strategy):

If there are too many positive samples, we determine the number of negative samples based on the above positive-negative ratio to ensure that all positive samples are used, otherwise we follow the original strategy.
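The rule above can be written as a small function that decides how many positives and negatives to sample. A minimal sketch, assuming "too many" means more positives than the fixed batch would allow (128 for RPN, 32 for Fast R-CNN) and that the original strategy fills the rest of the batch with negatives; the function and argument names are made up.

```python
def dynamic_sample_counts(num_pos, num_neg, batch_size, neg_per_pos):
    """Return (positives, negatives) to sample for one image.

    num_pos / num_neg: available positive and negative candidates.
    batch_size: fixed batch size of the original strategy (256 or 128).
    neg_per_pos: negatives per positive (1 for RPN, 3 for Fast R-CNN).
    """
    # Positive quota of the fixed strategy, e.g. 256 // 2 = 128 or 128 // 4 = 32.
    max_pos = batch_size // (1 + neg_per_pos)
    if num_pos > max_pos:
        # Dynamic branch: keep every positive and scale negatives by the ratio.
        return num_pos, min(num_neg, num_pos * neg_per_pos)
    # Original branch: use the positives available, fill the batch with negatives.
    return num_pos, min(num_neg, batch_size - num_pos)


print(dynamic_sample_counts(200, 10000, batch_size=256, neg_per_pos=1))  # RPN, crowded image
print(dynamic_sample_counts(50, 10000, batch_size=256, neg_per_pos=1))   # RPN, sparse image
print(dynamic_sample_counts(40, 10000, batch_size=128, neg_per_pos=3))   # Fast R-CNN
```

In a crowded image the batch simply grows, so no annotated pedestrian is left out of the loss.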

B. Vanilla RetinaNet

【Focal Loss】《Focal Loss for Dense Object Detection》

6 Experiments

A. Model Analysis

Ignore-region handling prevents the sampling of background boxes in those ignored areas.

B. Dataset Analysis

1) Detection Results

WiderPerson
[Image: detection results on WiderPerson]
Caltech-USA
[Image: detection results on Caltech-USA]
WiderPerson ⇒ Caltech-USA means pre-training on WiderPerson and fine-tuning on Caltech-USA; the rows below the double rule are state-of-the-art methods.

CityPersons
[Image: detection results on CityPersons]

[Image: example detection results]

2) Quantity Analysis

There is a logarithmic relation between the amount of training data and the performance of deep learning methods.

3) Quality Analysis

Adding fine-grained annotations for riders helps pedestrian detection performance.

4) Error Analysis

  • false negative: a missed detection
  • false positive: a false alarm

LOC and BG are both types of false positives.

LOC indicates a localization error that occurs when a pedestrian is detected with a misaligned bounding box, and BG indicates that a background region is mistakenly detected as a pedestrian.
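One way to split false positives into these two groups is by how much they overlap any ground-truth pedestrian. The sketch below is an assumption, not the paper's exact protocol: a false positive with IoU of at least 0.1 against some ground-truth box is called LOC, anything else BG. The 0.1 threshold and the function name are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def false_positive_type(det, gt_boxes, loc_iou=0.1):
    """Label a detection already known to be a false positive as LOC or BG."""
    best = max((iou(det, gt) for gt in gt_boxes), default=0.0)
    # Some overlap with a real pedestrian: the box is misaligned (LOC).
    # No meaningful overlap: the detector fired on background (BG).
    return "LOC" if best >= loc_iou else "BG"


gts = [(5, 0, 15, 20)]
print(false_positive_type((0, 0, 10, 20), gts))         # partial overlap
print(false_positive_type((100, 100, 110, 120), gts))   # no overlap
```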

C. Generalization Capability

Pre-train on WiderPerson, then fine-tune on the target dataset.

1) Caltech
Table VII

2) CityPersons
Table VIII

7 Conclusion (own)

  • NMS is usually not trained, yet it has a great influence on detection performance.

  • Reducing false positives
    How can an object detection algorithm be made to stop raising false alarms?
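To make the NMS point concrete, here is a sketch of plain greedy NMS. Nothing in it is learned: the only knob is a hand-set IoU threshold (0.5 below is just a common default, not a value from the paper), so in crowded scenes that one number decides whether an overlapping neighbour is kept or suppressed.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))                  # the second box is suppressed
print(greedy_nms(boxes, scores, iou_thresh=0.9))  # a looser threshold keeps it
```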
