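1, The paper builds a large-scale unlabeled person re-identification dataset, LUPerson.
2, The paper carefully studies the key factors of unsupervised pre-training for Re-ID.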
Paper title | Abbreviation | Venue | Year | Baseline | Backbone | Datasets |
---|---|---|---|---|---|---|
Unsupervised Pre-training for Person Re-identification | - | CVPR | 2021 | - | - | LUPerson, Market-1501, DukeMTMC-reID, CUHK03, MSMT17 |
Online link: https://openaccess.thecvf.com/content/CVPR2021/html/Fu_Unsupervised_Pre-Training_for_Person_Re-Identification_CVPR_2021_paper.html
Source code: -
1, we present a large scale unlabeled person re-identification (Re-ID) dataset “LUPerson” and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation.
2, Based on LUPerson, we systematically study the key factors for learning Re-ID features from two perspectives: data augmentation and contrastive loss.
Our results also show that the performance improvement is more significant on small-scale target datasets or under few-shot setting.
Figure 3: Illustration of the Momentum Contrast mechanism for contrastive learning in MoCo [19].
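1, The paper builds its own dataset, LUPerson: Large-scale Re-ID Dataset. It is a very large person re-identification dataset with more than 200K identities, collected from 46 scenes.
2, The data is collected from YouTube. YOLO-v5 trained on MS-COCO is used for person detection, and HRNet is used to detect body keypoints.
A detected person is kept as valid if: (a) the head and upper body are visible, (b) the lower body is partially visible, (c) the height-to-width ratio is between 1.5 and 5, (d) the detection confidence is above 0.72, and (e) the bounding box width is greater than 48 px.
3, Comparison between LUPerson and existing datasets.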
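4, The widely used contrastive learning method MoCoV2 [9] serves as the baseline. Two augmentations of the same image produce the query sample q and the positive sample k+, and the contrastive loss is built as in Equation 1.
5, The augmentation strategy designed in the paper makes two changes: (1) it adds RandomErasing with high strength, and (2) it removes color jitter.
6, The paper studies the temperature hyperparameter in the loss function and finds the best value to be 0.07.
7, For feature transfer, existing methods reuse hyperparameters tuned for ImageNet, which the authors argue may not suit Re-ID. The paper recalibrates these parameters and adds three extra BN layers to the network head.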
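To make items 2, 4 and 6 more concrete, here is a minimal pure-Python sketch of the bounding-box validity rule and of a MoCo-style contrastive (InfoNCE) loss with temperature 0.07. It is not the authors' code. The function names `is_valid_person`, `l2_normalize` and `info_nce_loss` are hypothetical, the two visibility flags are assumed to come from a keypoint detector, treating the 1.5 to 5 range as inclusive is an assumption, and the loss is written in the standard MoCo form, which is assumed to match the paper's Equation 1.

```python
import math
from typing import Sequence


def is_valid_person(head_upper_visible: bool, lower_partly_visible: bool,
                    width_px: float, height_px: float, det_conf: float) -> bool:
    """Apply validity rules (a)-(e) from item 2 to one detected box."""
    if width_px <= 0:
        return False
    aspect = height_px / width_px  # height-to-width ratio
    return (head_upper_visible            # (a) head and upper body visible
            and lower_partly_visible      # (b) lower body partially visible
            and 1.5 <= aspect <= 5.0      # (c) inclusive bounds are an assumption
            and det_conf > 0.72           # (d) detection confidence
            and width_px > 48)            # (e) minimum box width in pixels


def l2_normalize(v: Sequence[float]) -> list:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(q: Sequence[float], k_pos: Sequence[float],
                  negatives: Sequence[Sequence[float]],
                  temperature: float = 0.07) -> float:
    """-log( exp(q.k+/t) / sum_i exp(q.k_i/t) ), summing over k+ and all negatives."""
    q = l2_normalize(q)
    keys = [l2_normalize(k_pos)] + [l2_normalize(k) for k in negatives]
    logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]  # index 0 is the positive key


if __name__ == "__main__":
    print(is_valid_person(True, True, width_px=60, height_px=150, det_conf=0.9))
    print(info_nce_loss([1.0, 0.0], [1.0, 0.1], [[0.0, 1.0], [-1.0, 0.0]]))
```

A smaller temperature sharpens the softmax over keys, so hard negatives dominate the loss; this is why the value is worth tuning rather than inheriting from ImageNet settings.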
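1, The starting point of the paper is the construction of a very large person Re-ID dataset, which is itself a major contribution to the Re-ID research field.
2, The paper pays close attention to the implementation details of contrastive learning and, through careful experiments, improves the data augmentation strategy. It also questions the commonly used value of the temperature parameter and determines the best value experimentally.
3, For feature transfer, the paper also carefully recalibrates the original model hyperparameters, which is highly innovative as well.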
@inproceedings{DBLP:conf/cvpr/Fu0BYY0L021,
author = {Dengpan Fu and
Dongdong Chen and
Jianmin Bao and
Hao Yang and
Lu Yuan and
Lei Zhang and
Houqiang Li and
Dong Chen},
title = {Unsupervised Pre-Training for Person Re-Identification},
booktitle = {{CVPR}},
pages = {14750--14759},
publisher = {Computer Vision Foundation / {IEEE}},
year = {2021}
}
[1] Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. Unsupervised learning of visual features by contrasting cluster assignments. In Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), 2020.
[2] Binghui Chen, Weihong Deng, and Jiani Hu. Mixed high-order attention network for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 371–381, 2019.
[3] Dapeng Chen, Dan Xu, Hongsheng Li, Nicu Sebe, and Xiaogang Wang. Group consistent similarity learning via deep crf for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8649–8658, 2018.
[4] Guangyi Chen, Chunze Lin, Liangliang Ren, Jiwen Lu, and Jie Zhou. Self-critical attention learning for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 9637–9646, 2019.
[5] Tianlong Chen, Shaojin Ding, Jingyi Xie, Ye Yuan, Wuyang Chen, Yang Yang, Zhou Ren, and Zhangyang Wang. Abd-net: Attentive but diverse person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 8351–8361, 2019.
[6] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709, 2020.
[7] Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, and Geoffrey Hinton. Big self-supervised models are strong semi-supervised learners. arXiv preprint arXiv:2006.10029, 2020.
[8] Xinlei Chen, Haoqi Fan, Ross Girshick, and Kaiming He. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020.
[9] Xinlei Chen, Haoqi Fan, Ross Girshick, and Kaiming He. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020.
[10] Zuozhuo Dai, Mingqiang Chen, Xiaodong Gu, Siyu Zhu, and Ping Tan. Batch dropblock network for person re-identification and beyond. In Proceedings of the IEEE International Conference on Computer Vision, pages 3691–3701, 2019.
[11] Piotr Dollár, Ron Appel, Serge Belongie, and Pietro Perona. Fast feature pyramids for object detection. IEEE transactions on pattern analysis and machine intelligence, 36(8):1532–1545, 2014.
[12] Pedro F Felzenszwalb, Ross B Girshick, David McAllester, and Deva Ramanan. Object detection with discriminatively trained part-based models. IEEE transactions on pattern analysis and machine intelligence, 32(9):1627–1645, 2009.
[13] Dengpan Fu, Bo Xin, Jingdong Wang, Dongdong Chen, Jianmin Bao, Gang Hua, and Houqiang Li. Improving person re-identification with iterative impression aggregation. IEEE Transactions on Image Processing, 29:9559–9571, 2020.
[14] Yixiao Ge, Dapeng Chen, and Hongsheng Li. Mutual mean-teaching: Pseudo label refinery for unsupervised domain adaptation on person re-identification. In International Conference on Learning Representations, 2019.
[15] Yixiao Ge, Feng Zhu, Dapeng Chen, Rui Zhao, and Hongsheng Li. Self-paced contrastive learning with hybrid memory for domain adaptive object re-id. In Advances in Neural Information Processing Systems, 2020.
[16] Douglas Gray and Hai Tao. Viewpoint invariant pedestrian recognition with an ensemble of localized features. In European conference on computer vision, pages 262–275. Springer, 2008.
[17] Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent: A new approach to self-supervised learning. arXiv preprint arXiv:2006.07733, 2020.
[18] Jianyuan Guo, Yuhui Yuan, Lang Huang, Chao Zhang, Jin-Ge Yao, and Kai Han. Beyond human parts: Dual part-aligned representations for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3642–3651, 2019.
[19] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9729–9738, 2020.
[20] Lingxiao He, Xingyu Liao, Wu Liu, Xinchen Liu, Peng Cheng, and Tao Mei. Fastreid: A pytorch toolbox for general instance re-identification. arXiv preprint arXiv:2006.02631, 2020.
[21] Lingxiao He and Wu Liu. Guided saliency feature learning for person re-identification in crowded scenes. In European Conference on Computer Vision, pages 357–373. Springer, 2020.
[22] Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017.
[23] Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan, and Xilin Chen. Interaction-and-aggregation network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9317–9326, 2019.
[24] Xin Jin, Cuiling Lan, Wenjun Zeng, Guoqiang Wei, and Zhibo Chen. Semantics-aligned representation learning for person re-identification. In AAAI, pages 11173–11180, 2020.
[25] Srikrishna Karanam, Mengran Gou, Ziyan Wu, Angels Rates-Borras, Octavia Camps, and Richard J Radke. A comprehensive evaluation and benchmark for person re-identification: Features, metrics, and datasets. arXiv preprint arXiv:1605.09653, 2(3):5, 2016.
[26] Wei Li, Rui Zhao, Tong Xiao, and Xiaogang Wang. Deepreid: Deep filter pairing neural network for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 152–159, 2014.
[27] Yutian Lin, Xuanyi Dong, Liang Zheng, Yan Yan, and Yi Yang. A bottom-up clustering approach to unsupervised person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 8738–8745, 2019.
[28] Chen Change Loy, Chunxiao Liu, and Shaogang Gong. Person re-identification by manifold ranking. In 2013 IEEE International Conference on Image Processing, pages 3567–3571. IEEE, 2013.
[29] Hao Luo, Wei Jiang, Youzhi Gu, Fuxu Liu, Xingyu Liao, Shenqi Lai, and Jianyang Gu. A strong baseline and batch normalization neck for deep person re-identification. IEEE Transactions on Multimedia, 2019.
[30] Hyunjong Park and Bumsub Ham. Relation network for person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 11839–11847, 2020.
[31] Ruijie Quan, Xuanyi Dong, Yu Wu, Linchao Zhu, and Yi Yang. Auto-reid: Searching for a part-aware convnet for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3750–3759, 2019.
[32] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015.
[33] Yantao Shen, Hongsheng Li, Shuai Yi, Dapeng Chen, and Xiaogang Wang. Person re-identification with deep similarity-guided graph neural network. In Proceedings of the European conference on computer vision (ECCV), pages 486–504, 2018.
[34] Yumin Suh, Jingdong Wang, Siyu Tang, Tao Mei, and Kyoung Mu Lee. Part-aligned bilinear representations for person re-identification. In Proceedings of the European Conference on Computer Vision (ECCV), pages 402–419, 2018.
[35] Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. Deep high-resolution representation learning for human pose estimation. In CVPR, 2019.
[36] Yifan Sun, Liang Zheng, Yi Yang, Qi Tian, and Shengjin Wang. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In ECCV, 2018.
[37] Dongkai Wang and Shiliang Zhang. Unsupervised person re-identification via multi-label classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10981–10990, 2020.
[38] Guanshuo Wang, Yufeng Yuan, Xiong Chen, Jiwei Li, and Xi Zhou. Learning discriminative features with multiple granularities for person re-identification. In 2018 ACM Multimedia Conference on Multimedia Conference, pages 274–282. ACM, 2018.
[39] Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer gan to bridge domain gap for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 79–88, 2018.
[40] Zhirong Wu, Yuanjun Xiong, Stella X Yu, and Dahua Lin. Unsupervised feature learning via non-parametric instance discrimination. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3733–3742, 2018.
[41] Bryan Ning Xia, Yuan Gong, Yizhe Zhang, and Christian Poellabauer. Second-order non-local attention networks for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3760–3769, 2019.
[42] Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, and Quoc V Le. Unsupervised data augmentation for consistency training. arXiv preprint arXiv:1904.12848, 2019.
[43] Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, and Zhibo Chen. Densely semantically aligned person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 667–676, 2019.
[44] Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. 2015 IEEE International Conference on Computer Vision (ICCV), pages 1116–1124, 2015.
[45] Liang Zheng, Hengheng Zhang, Shaoyan Sun, Manmohan Chandraker, Yi Yang, and Qi Tian. Person re-identification in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1367–1376, 2017.
[46] Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, and Jan Kautz. Joint discriminative and generative learning for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2138–2147, 2019.
[47] Zhedong Zheng, Liang Zheng, and Yi Yang. Unlabeled samples generated by gan improve the person re-identification baseline in vitro. In Proceedings of the IEEE International Conference on Computer Vision, 2017.
[48] Zhun Zhong, Liang Zheng, Donglin Cao, and Shaozi Li. Re-ranking person re-identification with k-reciprocal encoding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1318–1327, 2017.
[49] Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, and Yi Yang. Random erasing data augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020.
[50] Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, and Tao Xiang. Omni-scale feature learning for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3702–3712, 2019.
[51] Kuan Zhu, Haiyun Guo, Zhiwei Liu, Ming Tang, and Jinqiao Wang. Identity-guided human semantic parsing for person re-identification. ECCV, 2020.