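In Python, virtual functions are implemented through inheritance. A method that every subclass must override can be enforced by raising NotImplementedError in the base class, as follows: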
def message_and_aggregate(
    self,
    adj_t: Union[SparseTensor, Tensor],
) -> Tensor:
    # Subclasses must override this method; the base class only declares it.
    raise NotImplementedError
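The same pattern can be sketched outside of any library. The class names `Shape` and `Square` and the helper `base_call_fails` below are hypothetical, chosen only to illustrate how the base class declares the method and a subclass supplies it:

```python
class Shape:
    """Base class whose area() acts like a pure virtual function."""

    def area(self) -> float:
        # Subclasses must override this method.
        raise NotImplementedError

    def describe(self) -> str:
        # Calls whichever area() the concrete subclass provides.
        return f"{type(self).__name__} with area {self.area()}"


class Square(Shape):
    def __init__(self, side: float):
        self.side = side

    def area(self) -> float:
        return self.side * self.side


def base_call_fails() -> bool:
    """Return True if calling the un-overridden method raises."""
    try:
        Shape().area()
    except NotImplementedError:
        return True
    return False
```

Unlike a class built on the `abc` module, `Shape` can still be instantiated; the error only appears when the missing method is actually called.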
Graph convolution is usually formulated as a neighborhood aggregation or message passing scheme: each node builds messages from its neighbors and then aggregates them.
With $\mathbf{x}^{(k-1)}_i \in \mathbb{R}^F$ denoting node features of node $i$ in layer $(k-1)$ and $\mathbf{e}_{j,i} \in \mathbb{R}^D$ denoting (optional) edge features from node $j$ to node $i$, message passing graph neural networks can be described as:
$$\mathbf{x}_i^{(k)} = \gamma^{(k)} \left( \mathbf{x}_i^{(k-1)}, \square_{j \in \mathcal{N}(i)} \, \phi^{(k)}\left(\mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)},\mathbf{e}_{j,i}\right) \right),$$
where $\square$ denotes a differentiable, permutation invariant function, e.g., sum, mean or max, and $\gamma$ and $\phi$ denote differentiable functions such as MLPs (Multi Layer Perceptrons).
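To make the roles of $\phi$, $\square$ and $\gamma$ concrete, here is a minimal pure-Python sketch of one message passing layer. It is not the PyTorch Geometric implementation: the function names `message_passing_step` and `vector_sum` are hypothetical, edge features are left out, and the sketch assumes every node has at least one incoming edge.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def message_passing_step(
    x: Dict[int, Vector],
    edges: List[Tuple[int, int]],
    phi: Callable[[Vector, Vector], Vector],
    aggregate: Callable[[List[Vector]], Vector],
    gamma: Callable[[Vector, Vector], Vector],
) -> Dict[int, Vector]:
    """One layer: x_i' = gamma(x_i, aggregate over j of phi(x_i, x_j))."""
    # Collect the messages arriving at each target node i from edges (j, i).
    inbox: Dict[int, List[Vector]] = {i: [] for i in x}
    for j, i in edges:
        inbox[i].append(phi(x[i], x[j]))
    # Aggregate the messages of each node, then update its features.
    return {i: gamma(x[i], aggregate(inbox[i])) for i in x}


def vector_sum(messages: List[Vector]) -> Vector:
    """Permutation invariant aggregation: element-wise sum."""
    return [sum(column) for column in zip(*messages)]


# Tiny graph with edges stored as (source j, target i).
features = {0: [1.0], 1: [2.0], 2: [4.0]}
edge_list = [(1, 0), (2, 0), (0, 1), (0, 2)]

updated = message_passing_step(
    features,
    edge_list,
    phi=lambda x_i, x_j: x_j,                              # message = neighbor feature
    aggregate=vector_sum,
    gamma=lambda x_i, m: [a + b for a, b in zip(x_i, m)],  # add the node's own feature
)
# updated == {0: [7.0], 1: [3.0], 2: [5.0]}
```

Because the aggregation is a sum, reordering `edge_list` leaves `updated` unchanged, which is the permutation invariance required of $\square$.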
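MessagePassing.message(…):
Constructs messages to node $i$, analogous to $\phi$, for each edge $(j,i) \in \mathcal{E}$ if flow="source_to_target", or for each edge $(i,j) \in \mathcal{E}$ if flow="target_to_source". It can take any argument that was passed to propagate(). In addition, tensors passed to propagate() can be mapped to the respective nodes $i$ and $j$ by appending the suffix _i or _j to the variable name, e.g. x_i and x_j. By convention, $i$ is the central node and $j$ is the neighboring node.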
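The effect of the _i and _j suffixes can be pictured as an indexing step that expands node features into one row per edge. The following is only a sketch of that idea in plain Python, not the library's internal code; `lift_to_edges` is a hypothetical helper name, and `edge_index` is represented as a pair of lists (row, col).

```python
from typing import List, Tuple


def lift_to_edges(
    x: List[List[float]],
    edge_index: Tuple[List[int], List[int]],
    flow: str = "source_to_target",
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (x_i, x_j), each with one row per edge."""
    row, col = edge_index
    if flow == "source_to_target":
        j_index, i_index = row, col  # edges are (j, i): messages flow row -> col
    elif flow == "target_to_source":
        i_index, j_index = row, col  # edges are (i, j): messages flow col -> row
    else:
        raise ValueError(f"unknown flow: {flow}")
    x_i = [x[i] for i in i_index]  # features of the central node of each edge
    x_j = [x[j] for j in j_index]  # features of the neighboring node of each edge
    return x_i, x_j


node_features = [[10.0], [20.0], [30.0]]
cycle = ([0, 1, 2], [1, 2, 0])  # edges 0->1, 1->2, 2->0
x_i, x_j = lift_to_edges(node_features, cycle)
# x_j == [[10.0], [20.0], [30.0]] and x_i == [[20.0], [30.0], [10.0]]
```

Switching `flow` to "target_to_source" swaps which row of `edge_index` plays the role of the central node.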
MessagePassing.update(aggr_out, …):
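Updates node embeddings, analogous to $\gamma$, for each node $i \in \mathcal{V}$. It takes the output of the aggregation as its first argument, followed by any argument that was initially passed to propagate().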
Example 1: The GCN layer is mathematically defined as
$$\mathbf{x}_i^{(k)} = \sum_{j \in \mathcal{N}(i) \cup \{ i \}} \frac{1}{\sqrt{\deg(i)} \cdot \sqrt{\deg(j)}} \cdot \left( \mathbf{W}^{\top} \cdot \mathbf{x}_j^{(k-1)} \right) + \mathbf{b},$$
$$\phi^{(k)}\left(\mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)},\mathbf{e}_{j,i}\right) = \frac{1}{\sqrt{\deg(i)} \cdot \sqrt{\deg(j)}} \cdot \left( \mathbf{W}^{\top} \cdot \mathbf{x}_j^{(k-1)} \right)$$
$$\square_{j \in \mathcal{N}(i)} = \sum_{j \in \mathcal{N}(i) \cup \{ i \}}$$
$$\gamma^{(k)}\left(\mathbf{x}_i^{(k-1)}, \mathbf{m}_i\right) = \mathbf{m}_i + \mathbf{b}$$
where $\mathbf{m}_i$ denotes the aggregated messages of node $i$. The bias $\mathbf{b}$ is added once per node after the sum, so it is part of $\gamma$; adding it inside $\phi$ would count it once per neighbor.
1. Add self-loops to the adjacency matrix.
2. Linearly transform node feature matrix.
3. Compute normalization coefficients.
4. Normalize node features.
5. Sum up neighboring node features ("add" aggregation).
6. Apply a final bias vector.
Steps 1-3 are typically computed before message passing takes place. Steps 4-5 can be easily processed using the MessagePassing base class. The full layer implementation is shown below:
import torch
from torch.nn import Linear, Parameter
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree


class GCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')  # "Add" aggregation (Step 5).
        # Trainable parameters.
        # lin.weight has shape [out_channels, in_channels].
        self.lin = Linear(in_channels, out_channels, bias=False)
        self.bias = Parameter(torch.empty(out_channels))
        self.reset_parameters()

    def reset_parameters(self):
        # Re-initialize the weight matrix and start from a zero bias.
        self.lin.reset_parameters()
        self.bias.data.zero_()

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix.
        # One matrix multiplication transforms all nodes at once, which is
        # faster than transforming the features separately for every edge.
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        # Node degrees and their inverse square roots.
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0
        # norm has shape [E]: 1 / (sqrt(deg(i)) * sqrt(deg(j))) for each edge.
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Steps 4-5: Start propagating messages.
        # The base class calls message() and then performs the aggregation.
        out = self.propagate(edge_index, x=x, norm=norm)

        # Step 6: Apply a final bias vector.
        out = out + self.bias

        return out

    def message(self, x_j, norm):
        # x_j has shape [E, out_channels]
        # x_j holds the source-node features of every edge, i.e. the features
        # of the neighbors of each node i.

        # Step 4: Normalize node features.
        # Broadcasting scales all E messages in one vectorized multiplication.
        return norm.view(-1, 1) * x_j
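The normalization can be checked by hand on a small graph. The sketch below reproduces Steps 1 and 3-5 in pure Python for scalar node features. As a simplifying assumption it skips Steps 2 and 6, which amounts to taking $\mathbf{W}$ as the identity and $\mathbf{b} = 0$; the function name `gcn_propagate` is hypothetical.

```python
import math
from typing import List, Tuple


def gcn_propagate(
    x: List[float],
    edges: List[Tuple[int, int]],
) -> List[float]:
    """Steps 1 and 3-5 of the GCN layer for scalar node features."""
    num_nodes = len(x)
    # Step 1: add a self-loop (i, i) for every node.
    edges = edges + [(i, i) for i in range(num_nodes)]
    # Step 3: degree of each node, counted on the target side of the edges.
    deg = [0] * num_nodes
    for _, target in edges:
        deg[target] += 1
    # Steps 4-5: scale each message by 1 / sqrt(deg(i) * deg(j)), sum per target.
    out = [0.0] * num_nodes
    for source, target in edges:
        norm = 1.0 / math.sqrt(deg[source] * deg[target])
        out[target] += norm * x[source]
    return out


# Path graph 0 - 1 - 2, stored as directed edges in both directions.
path_edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
result = gcn_propagate([1.0, 2.0, 3.0], path_edges)
```

With self-loops the degrees are 2, 3 and 2, so node 0 receives $\frac{1}{2} \cdot 1 + \frac{1}{\sqrt{6}} \cdot 2$ and node 1 receives $\frac{1}{\sqrt{6}} \cdot 1 + \frac{1}{3} \cdot 2 + \frac{1}{\sqrt{6}} \cdot 3$.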
Example 2: Implementing the Edge Convolution
The edge convolutional layer processes graphs or point clouds and is mathematically defined as
$$\mathbf{x}_i^{(k)} = \max_{j \in \mathcal{N}(i)} h_{\mathbf{\Theta}} \left( \mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)} - \mathbf{x}_i^{(k-1)} \right),$$
where $h_{\mathbf{\Theta}}$ denotes an MLP.
$$\phi^{(k)}\left(\mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)},\mathbf{e}_{j,i}\right) = h_{\mathbf{\Theta}} \left( \mathbf{x}_i^{(k-1)}, \mathbf{x}_j^{(k-1)} - \mathbf{x}_i^{(k-1)} \right),$$
$$\square_{j \in \mathcal{N}(i)} = \max_{j \in \mathcal{N}(i)}$$
$$\gamma^{(k)} = \text{direct mapping}$$
import torch
from torch.nn import Sequential as Seq, Linear, ReLU
from torch_geometric.nn import MessagePassing


class EdgeConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='max')  # "Max" aggregation.
        # Define the MLP h_Theta; the ReLU keeps the two linear layers
        # from collapsing into a single linear map.
        self.mlp = Seq(Linear(2 * in_channels, out_channels),
                       ReLU(),
                       Linear(out_channels, out_channels))

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        return self.propagate(edge_index, x=x)

    def message(self, x_i, x_j):
        # x_i has shape [E, in_channels]
        # x_j has shape [E, in_channels]

        # Concatenate the center features with the relative neighbor features.
        tmp = torch.cat([x_i, x_j - x_i], dim=1)  # tmp has shape [E, 2 * in_channels]
        return self.mlp(tmp)
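The max aggregation over the inputs $(\mathbf{x}_i, \mathbf{x}_j - \mathbf{x}_i)$ can also be traced by hand. The sketch below is plain Python and makes a strong simplifying assumption: the MLP $h_{\mathbf{\Theta}}$ is replaced by the identity, so each message is just the concatenation itself. The function name `edge_conv_max` is hypothetical.

```python
from typing import Dict, List, Tuple


def edge_conv_max(
    x: List[List[float]],
    edges: List[Tuple[int, int]],
) -> Dict[int, List[float]]:
    """Element-wise max over neighbors j of concat(x_i, x_j - x_i)."""
    out: Dict[int, List[float]] = {}
    for j, i in edges:  # edge (j, i): message from neighbor j to center node i
        diff = [b - a for a, b in zip(x[i], x[j])]
        message = x[i] + diff  # list concatenation, length 2 * in_channels
        if i in out:
            out[i] = [max(a, b) for a, b in zip(out[i], message)]
        else:
            out[i] = message
    return out


points = [[0.0, 0.0], [1.0, 2.0], [3.0, -1.0]]
conv = edge_conv_max(points, [(1, 0), (2, 0), (0, 1)])
# conv == {0: [0.0, 0.0, 3.0, 2.0], 1: [1.0, 2.0, -1.0, -2.0]}
```

Node 2 has no incoming edge in this example, so it does not appear in the result; the first half of each output row is always $\mathbf{x}_i$ itself because it is identical in every message sent to node $i$.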