[Code Reading] PointNet++ Code Walkthrough

Table of Contents

  • SA layer
  • FP layer

This post is the first part of a series on reading the PointNet++ CUDA code. The other parts are:
(1) PointNet++ Code Walkthrough
(2) The CUDA Implementation of FPS in PointNet++
(3) The CUDA Implementation of Ball Query in PointNet++
(4) The CUDA Implementation of Three_nn in PointNet++


The core operations in PointNet++ are the SA (set abstraction) layer and the FP (feature propagation) layer. This post walks through what each of them actually does.

SA Layer

Parameters: number of downsampled points (npoints), neighborhood radius (radii), number of points per neighborhood (nsample), and the MLP
Input: xyz, features

The computation proceeds as follows:

new_xyz_idx = FPS(xyz, npoints)  # use FPS to pick the indices of the downsampled points, denoted new_xyz_idx
new_xyz = gather(xyz, new_xyz_idx)  # gather the downsampled points by those indices, denoted new_xyz
idx = ball_query(xyz, new_xyz, radii, nsample)  # for each point in new_xyz, find the indices of up to nsample points of xyz within radius radii, denoted idx
grouped_xyz, grouped_feature = group(xyz, feature, idx)  # use idx to group the neighborhood xyz and the corresponding features
new_feature = torch.cat([grouped_xyz - new_xyz, grouped_feature], dim=1)  # relative coordinates + features; new_feature: (B, 3 + C, npoint, nsample)
new_feature = MLP(new_feature)  # new_feature: (B, C', npoint, nsample)
new_feature = max_pooling(new_feature, dim=3)  # new_feature: (B, C', npoint)
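
In the released PyTorch implementations, the per-neighborhood MLP is typically realized as 1x1 Conv2d layers over the (B, C, npoint, nsample) tensor, followed by a max over the nsample dimension. A minimal sketch (the layer widths below are illustrative assumptions, not a specific PointNet++ config):

import torch
import torch.nn as nn

B, C, npoint, nsample = 2, 3, 128, 32
new_feature = torch.randn(B, 3 + C, npoint, nsample)  # grouped relative xyz + features

# Shared MLP: 1x1 convolutions act as a per-point MLP applied at every
# (npoint, nsample) location; the channel sizes are just an example.
mlp = nn.Sequential(
    nn.Conv2d(3 + C, 64, kernel_size=1), nn.BatchNorm2d(64), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=1), nn.BatchNorm2d(128), nn.ReLU(),
)

out = mlp(new_feature)   # (B, 128, npoint, nsample)
out = out.max(dim=3)[0]  # max over the neighborhood -> (B, 128, npoint)
print(out.shape)         # torch.Size([2, 128, 128])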

At this point, the SA layer has finished downsampling the points and extracting their features. Looking more closely, four of the functions involved are written in CUDA: FPS, gather, ball_query, and group. Among them, FPS and ball_query only produce indices; they do not touch the points or features themselves, so no gradients need to flow through them. gather and group, on the other hand, pick points out of xyz according to those indices, which could also be done with torch.gather.
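
To make that concrete, here is a minimal sketch of the gather step using torch.gather (the tensor names and shapes follow the pseudocode above and are only illustrative):

import torch

B, N, npoint = 2, 1024, 128
xyz = torch.randn(B, N, 3)                      # input points
new_xyz_idx = torch.randint(0, N, (B, npoint))  # indices as produced by FPS

# Expand the indices to (B, npoint, 3) so they match the last dim of xyz,
# then gather along the point dimension.
idx_expanded = new_xyz_idx.unsqueeze(-1).expand(B, npoint, 3)
new_xyz = torch.gather(xyz, dim=1, index=idx_expanded)        # (B, npoint, 3)

# The same result via plain fancy indexing:
new_xyz_alt = xyz[torch.arange(B).unsqueeze(1), new_xyz_idx]  # (B, npoint, 3)
assert torch.equal(new_xyz, new_xyz_alt)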

Given the analysis above, the pieces that really matter are FPS and ball_query; the other two just select points by index.
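
For reference, here is a naive pure-PyTorch sketch of FPS (the CUDA version is the subject of part two of this series). The Python loop makes it slow, but it shows the algorithm: repeatedly pick the point farthest from the already-selected set.

import torch

def farthest_point_sample(xyz, npoint):
    # xyz: (B, N, 3) -> indices of the sampled points, (B, npoint)
    B, N, _ = xyz.shape
    idx = torch.zeros(B, npoint, dtype=torch.long)
    dist = torch.full((B, N), float('inf'))      # distance of each point to the selected set
    farthest = torch.zeros(B, dtype=torch.long)  # start from point 0
    batch = torch.arange(B)
    for i in range(npoint):
        idx[:, i] = farthest
        centroid = xyz[batch, farthest].unsqueeze(1)  # (B, 1, 3)
        d = ((xyz - centroid) ** 2).sum(-1)           # squared distance to the newest point, (B, N)
        dist = torch.minimum(dist, d)                 # distance to the whole selected set
        farthest = dist.argmax(dim=1)                 # next point: the one farthest from the set
    return idx

pts = torch.randn(2, 1024, 3)
print(farthest_point_sample(pts, 128).shape)  # torch.Size([2, 128])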

FP Layer

Parameters: the MLP
Input: known, unknown, known_feats, unknow_feats, defined as follows (docstring from the original code):

    :param unknown: (B, n, 3) tensor of the xyz positions of the unknown features
    :param known: (B, m, 3) tensor of the xyz positions of the known features
    :param unknow_feats: (B, C1, n) tensor of the features to be propagated to
    :param known_feats: (B, C2, m) tensor of features to be propagated

In short, the FP layer propagates the global features back toward the original point cloud. Here, known are the points near the top of the feature pyramid and unknown are the points near the bottom; known_feats are the features attached to known, i.e. the global features, while unknow_feats are the features that unknown originally carried, i.e. local features.

The computation proceeds as follows:

dist, idx = three_nn(unknown, known)  # for each point in unknown, find the distances to and indices of its 3 nearest neighbors in known
weight = f(dist)  # turn the distances into interpolation weights
interpolated_feats = three_interpolate(known_feats, idx, weight)  # weighted sum of the 3 neighbors' features
new_features = torch.cat([interpolated_feats, unknow_feats], dim=1) # (B, C2 + C1, n)
new_features = MLP(new_features) # (B, C', n)
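
As a sketch of what f(dist) looks like in practice, the common implementation uses inverse-distance weights normalized over the 3 neighbors, roughly as follows (the epsilon is an assumption to avoid division by zero):

import torch

dist = torch.rand(2, 1024, 3)        # (B, n, 3): distances to the 3 nearest known points
dist_recip = 1.0 / (dist + 1e-8)     # closer neighbors get larger weights
norm = dist_recip.sum(dim=2, keepdim=True)
weight = dist_recip / norm           # (B, n, 3), each row sums to 1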

Looking at this computation, three_nn and three_interpolate are the two functions written in CUDA. three_nn only computes distances and indices, so it needs no gradients. three_interpolate amounts to first gathering from known_feats with idx and then taking a weighted average with weight. So the key question is how three_nn is implemented.
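
To illustrate that equivalence, here is a minimal pure-PyTorch sketch of three_interpolate as a gather followed by a weighted sum (shapes follow the docstring above; this is only a reference, not the CUDA kernel):

import torch

def three_interpolate_ref(known_feats, idx, weight):
    # known_feats: (B, C2, m), idx: (B, n, 3), weight: (B, n, 3) -> (B, C2, n)
    B, C2, m = known_feats.shape
    n = idx.shape[1]
    # Gather the features of the 3 nearest known points for every unknown point.
    idx_flat = idx.reshape(B, 1, n * 3).expand(B, C2, n * 3)
    gathered = torch.gather(known_feats, dim=2, index=idx_flat).reshape(B, C2, n, 3)
    # Weighted average over the 3 neighbors.
    return (gathered * weight.unsqueeze(1)).sum(dim=-1)

B, C2, m, n = 2, 64, 128, 1024
known_feats = torch.randn(B, C2, m)
idx = torch.randint(0, m, (B, n, 3))
weight = torch.softmax(torch.rand(B, n, 3), dim=-1)  # any weights that sum to 1 over the last dim
print(three_interpolate_ref(known_feats, idx, weight).shape)  # torch.Size([2, 64, 1024])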
