A few details about single-person 2D human pose estimation

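1、I noticed that single-person pose estimation networks do not include a human detector. The single-person benchmarks (for example LSP, MPII, and FLIC) all provide the location and scale of the target person, so detection is assumed to have been done before pose estimation starts.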
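Because the location and scale are given, the usual preprocessing is simply to crop around the annotated person. Here is a minimal sketch, assuming an MPII-style annotation in which the scale is the person's height relative to 200 pixels. The function name `person_crop_box` and the `padding` margin are hypothetical, chosen only for illustration.

```python
def person_crop_box(center, scale, ref_size=200.0, padding=1.25):
    """Return (left, top, right, bottom) of a square crop around the person.

    center:   (x, y) annotated person position, in image pixels.
    scale:    annotated person size relative to ref_size pixels
              (MPII-style convention, assumed here).
    padding:  hypothetical margin factor so that limbs are not cut off.
    """
    side = scale * ref_size * padding  # side length of the square crop
    half = side / 2.0
    cx, cy = center
    return (cx - half, cy - half, cx + half, cy + half)


# Example annotation (made-up numbers): person centred at (320, 240), scale 1.5.
box = person_crop_box(center=(320.0, 240.0), scale=1.5)
print(box)
```

The crop is then resized to the network's input resolution, so the network never has to find the person itself.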

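2、For the past few days I have been wondering how the stacked hourglass network manages both to localize the joints and to capture the long-range spatial relationships between them. Today I found the answer in the paper "Self Adversarial Training for Human Pose Estimation": "We calculate the mean square error between them to enforce the generator to learn the image features that are important for localizing the keypoints. In early stacks, local evidence is used since the receptive field is restricted to a small area. In later stacks, long-range spatial relationships will be considered since the receptive field has been enlarged through the numerous sequential convolutional operations."
In other words: the early stacks rely on local evidence because their receptive field covers only a small area, while the later stacks can take long-range spatial relationships into account because the many successive convolutions have enlarged the receptive field.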
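The growth of the receptive field with depth can be checked with a small calculation. The sketch below uses the standard recurrence for a plain chain of convolution and pooling layers. The layer lists are toy examples, not the actual hourglass architecture, and the calculation ignores skip connections and upsampling.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) pairs, applied in order.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between adjacent units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) jumps
        jump *= stride             # striding spreads later units further apart
    return rf


# Toy stacks: 3x3 convolutions (stride 1) and 2x2 pooling (stride 2).
shallow = [(3, 1)] * 2
deep = [(3, 1)] * 2 + [(2, 2)] + [(3, 1)] * 2 + [(2, 2)] + [(3, 1)] * 2

print(receptive_field(shallow))  # 5
print(receptive_field(deep))     # 32
```

Even in this toy setting the deeper chain sees a region more than six times wider, which is why later stacks can relate joints that are far apart.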

3、Amount of occlusion in the MPII validation set (discussed in the paper "Adversarial PoseNet: A Structure-aware Convolutional Network for Human Pose Estimation"): "In the validation set of MPII, about 25% of the elbows and wrists with annotations are labeled invisible."

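4、Representing each keypoint as a Gaussian heatmap works well with the subsequent convolution operations, because the heatmap is aligned pixel by pixel with the input image.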
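To make the heatmap representation concrete, here is a minimal NumPy sketch that renders one keypoint as a 2D Gaussian. The function name `gaussian_heatmap`, the 64x64 grid, and `sigma=2.0` are assumptions made for illustration, not values taken from any of the papers above.

```python
import numpy as np


def gaussian_heatmap(height, width, keypoint, sigma=2.0):
    """Render one keypoint as a 2D Gaussian on a (height, width) grid.

    keypoint: (x, y) in heatmap pixel coordinates.
    sigma:    spread of the Gaussian in pixels (illustrative value).
    """
    x0, y0 = keypoint
    xs = np.arange(width, dtype=np.float32)[None, :]   # shape (1, width)
    ys = np.arange(height, dtype=np.float32)[:, None]  # shape (height, 1)
    # Broadcasting gives the squared distance of every pixel to the keypoint.
    dist_sq = (xs - x0) ** 2 + (ys - y0) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))


heatmap = gaussian_heatmap(64, 64, keypoint=(20, 35))

# The peak of the heatmap sits at the keypoint, so argmax recovers (row, col).
row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print(row, col)
```

Because the peak lies at the same spatial position as the joint in the image, a convolutional network can regress the heatmap directly, and the joint location is read back with a simple argmax.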
