[Paper Notes] CVPR 2020: Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

The paper uses GCNs for skeleton-based action recognition.

Contribution

The paper proposes two designs:

  • a disentangled multi-scale aggregation scheme
  • a unified spatial-temporal graph convolutional module (G3D)

They address two problems, respectively:

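  • biased weighting problem: edge weights will be biased towards closer nodes against further nodes. For two nodes that are far apart, feature sharing between them is weak: because of the distance, the weight hardly propagates that far, so learning long-range relationships is difficult. For example, at scale = 7, the chance of actually reaching a node at distance 7 is small (I did not fully understand this part). (For the original multi-scale GCN, see the paper Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition.)
  • factorised spatial-temporal relationship learning: A typical approach is to extract spatial relationships at each time step and then model temporal dynamics. As a result, there is no direct information flow across space-time; relationships can only be extracted indirectly, space first and then time.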
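To make the first problem more concrete for myself, here is a small sketch. It assumes that the "original" multi-scale scheme aggregates with powers of the self-looped adjacency matrix, and it uses a 15-joint chain as a toy stand-in for a skeleton (both are my assumptions; the helper names are made up for illustration). Entry (i, j) of the k-th power counts the length-k walks from i to j, so it shows how much of the scale-7 "weight" actually lands on nodes at each distance.

```python
import numpy as np

def path_graph_adjacency(n):
    """Adjacency matrix of a chain 0-1-...-(n-1), a toy stand-in for a skeleton."""
    A = np.zeros((n, n), dtype=np.int64)
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1
    A[idx + 1, idx] = 1
    return A

def walks_by_distance(n, k, src):
    """Count length-k walks from `src` to the node d hops to its right, d = 0..k."""
    # Self-loops are added first, as is usual in GCN-style aggregation.
    A_tilde = path_graph_adjacency(n) + np.eye(n, dtype=np.int64)
    row = np.linalg.matrix_power(A_tilde, k)[src]
    return {d: int(row[src + d]) for d in range(k + 1) if src + d < n}

# Scale k = 7, starting from the middle joint of a 15-joint chain.
counts = walks_by_distance(n=15, k=7, src=7)
for d, c in counts.items():
    print(f"distance {d}: {c} walks")
```

On this toy chain the source node itself receives 393 walks while the node at distance 7 receives exactly 1, so the 7th power is dominated by nearby nodes. That seems to be what "biased towards closer nodes" refers to.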

Methods

  • a disentangled multi-scale aggregation scheme

    The k-scale adjacency matrix is designed as follows:
    [Image: definition of the k-scale adjacency matrix]

    The GCN computation is then carried out as follows:
    [Image: GCN update rule using the k-scale adjacency matrices]
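Since the two equation images are not readable here, this is how I would write them out from my reading of the paper. Treat it as a reconstruction to check against the paper rather than a verbatim copy. The symbols are my transcription: d(v_i, v_j) is the shortest-path distance in hops, K is the largest scale, \tilde{D}_{(k)} is the degree matrix of \tilde{A}_{(k)}, \Theta_{(k)}^{(l)} is the learnable weight at scale k in layer l, and \sigma is the activation.

```latex
\[
[\tilde{A}_{(k)}]_{i,j} =
\begin{cases}
1 & \text{if } d(v_i, v_j) = k, \\
1 & \text{if } i = j, \\
0 & \text{otherwise,}
\end{cases}
\]
\[
X^{(l+1)} = \sigma\!\left( \sum_{k=0}^{K}
  \tilde{D}_{(k)}^{-\frac{1}{2}} \tilde{A}_{(k)} \tilde{D}_{(k)}^{-\frac{1}{2}}
  X^{(l)} \Theta_{(k)}^{(l)} \right)
\]
```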
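    Done this way, no high-order (powered) adjacency matrix is used; instead, nodes are split directly into different scales by their distance. Node relationships at different distances can then be treated equally.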
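A minimal NumPy sketch of this idea, under my own assumptions: the k-scale matrix keeps only pairs at shortest-path distance exactly k plus self-loops, each scale is symmetrically normalised, and the scales are summed with separate weights before a ReLU. The function names, the 5-joint chain, and the random features and weights are all hypothetical.

```python
import numpy as np

def k_scale_adjacency(A, k):
    """Binary matrix: 1 where shortest-path distance is exactly k, or i == j."""
    n = A.shape[0]
    I = np.eye(n, dtype=np.int64)
    if k == 0:
        return I
    A_tilde = A + I
    # Nodes within k hops minus nodes within k-1 hops leaves distance-k pairs.
    within_k = (np.linalg.matrix_power(A_tilde, k) >= 1).astype(np.int64)
    within_km1 = (np.linalg.matrix_power(A_tilde, k - 1) >= 1).astype(np.int64)
    return I + within_k - within_km1

def sym_normalize(A):
    """D^{-1/2} A D^{-1/2}; degrees are >= 1 because every node has a self-loop."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def disentangled_aggregate(X, A, thetas):
    """Sum over scales k = 0..K of normalised A_(k) @ X @ theta_k, then ReLU."""
    out = 0.0
    for k, theta in enumerate(thetas):
        out = out + sym_normalize(k_scale_adjacency(A, k)) @ X @ theta
    return np.maximum(out, 0.0)

# Toy usage: a 5-joint chain, 4 input channels, 8 output channels, K = 3.
rng = np.random.default_rng(0)
A = np.zeros((5, 5), dtype=np.int64)
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1
K = 3
X = rng.standard_normal((5, 4))
thetas = [rng.standard_normal((4, 8)) for _ in range(K + 1)]
out = disentangled_aggregate(X, A, thetas)
print(k_scale_adjacency(A, 2))
print(out.shape)
```

In the printed 2-scale matrix, each row has a 1 only on the diagonal and at the joints exactly two hops away, so every scale contributes its own neighbourhood instead of re-counting closer ones.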

  • a unified spatial-temporal graph convolutional module (G3D)

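    Take a temporal sliding window of τ frames. By tiling the adjacency matrix across the sliding window as shown below, each node is directly connected to its corresponding neighbour nodes in every frame of the window. I found this really clever, though maybe I just haven't seen much:
    [Image: block adjacency matrix obtained by tiling the adjacency matrix over the τ-frame window]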

    G3D is shown in (b), and multi-scale G3D in (c):
    [Figure: architecture overview; panel (b) G3D, panel (c) multi-scale G3D]
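The tiling step in the G3D item can be sketched as below. This is my own minimal version, not the authors' code: it assumes the window adjacency is the self-looped skeleton adjacency repeated as a τ × τ grid of blocks, zero-pads the sequence in time, and applies one normalised aggregation per window. The readout that collapses each window back to one frame is left out, and all names are hypothetical.

```python
import numpy as np

def tiled_window_adjacency(A, tau):
    """(tau*N, tau*N) block matrix whose every block is A + I."""
    n = A.shape[0]
    A_tilde = A + np.eye(n, dtype=A.dtype)
    return np.tile(A_tilde, (tau, tau))

def sym_normalize(A):
    """D^{-1/2} A D^{-1/2}; degrees are >= 1 because of the self-loops."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def unfold_windows(X, tau):
    """(T, N, C) -> (T, tau*N, C): one zero-padded tau-frame window per time step."""
    T, n, c = X.shape
    pad = tau // 2
    X_pad = np.pad(X, ((pad, pad), (0, 0), (0, 0)))
    # Row-major reshape gives frame-major node order, matching the block layout.
    return np.stack([X_pad[t:t + tau].reshape(tau * n, c) for t in range(T)])

def g3d_aggregate(X, A, theta, tau):
    """One spatial-temporal aggregation over every window, followed by ReLU."""
    A_norm = sym_normalize(tiled_window_adjacency(A, tau))
    windows = unfold_windows(X, tau)
    return np.maximum(A_norm @ windows @ theta, 0.0)

# Toy usage: 5-joint chain, T = 6 frames, 4 -> 8 channels, window of 3 frames.
rng = np.random.default_rng(0)
N, T, tau = 5, 6, 3
A = np.zeros((N, N), dtype=np.int64)
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1
A_tau = tiled_window_adjacency(A, tau)
X = rng.standard_normal((T, N, 4))
theta = rng.standard_normal((4, 8))
out = g3d_aggregate(X, A, theta, tau)
print(A_tau.shape, out.shape)
```

Here `A_tau[0, N + 1]` is 1, meaning joint 0 in the first frame of the window is linked directly to joint 1 in the second frame, which is the cross space-time connection the factorised approach lacks. The multi-scale variant would, as far as I understand, tile a k-scale adjacency matrix in the same way.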

Results

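The results are good, especially on the newer NTU RGB+D 120 dataset, where the improvement is large. My question: why is only one graph-based paper, [33], used for comparison on NTU RGB+D 120? Is there really no other work? (I don't know this field well, so corrections are welcome.)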
