Paper Reading and Analysis: Graph Attention Networks

Graph Attention Networks




[Image 1 (not reproduced here)]


Left: the attention mechanism $a(W\vec{h}_i, W\vec{h}_j)$ used by the model, parameterized by a weight vector $\vec{a} \in \mathbb{R}^{2F'}$ and followed by a LeakyReLU activation.
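For reference, the score that this mechanism assigns to a pair of nodes, and its softmax normalization over the neighborhood $\mathcal{N}_i$ (which includes node $i$ itself), can be written in the paper's notation roughly as follows, with $\Vert$ denoting concatenation. The torch_geometric excerpt further down states the same quantity with $\mathbf{\Theta}$ in place of $W$ and $\mathbf{x}$ in place of $\vec{h}$.

$$e_{ij} = \mathrm{LeakyReLU}\left(\vec{a}^{\top}\left[W\vec{h}_i \,\Vert\, W\vec{h}_j\right]\right), \qquad \alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$$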




torch_geometric.nn — pytorch_geometric documentation (pytorch-geometric.readthedocs.io)

The graph attentional operator from the “Graph Attention Networks” paper:

$$\mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} + \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j},$$

where the attention coefficients $\alpha_{i,j}$ are computed as

$$\alpha_{i,j} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j]\right)\right)}{\sum_{k \in \mathcal{N}(i) \cup \{ i \}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k]\right)\right)}.$$

If the graph has multi-dimensional edge features $\mathbf{e}_{i,j}$, the attention coefficients $\alpha_{i,j}$ are computed as

$$\alpha_{i,j} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j \, \Vert \, \mathbf{\Theta}_{e} \mathbf{e}_{i,j}]\right)\right)}{\sum_{k \in \mathcal{N}(i) \cup \{ i \}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k \, \Vert \, \mathbf{\Theta}_{e} \mathbf{e}_{i,k}]\right)\right)}.$$
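To make the operator concrete, here is a minimal pure-Python sketch of a single attention head without edge features. It is an illustration only, not the torch_geometric implementation: the function name `gat_layer`, the adjacency-list input `neighbors`, and the toy numbers are all assumptions made for this example.

```python
import math


def leaky_relu(z, negative_slope=0.2):
    """LeakyReLU applied to a single scalar."""
    return z if z > 0 else negative_slope * z


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gat_layer(x, neighbors, theta, a, negative_slope=0.2):
    """Single-head graph attention update (hypothetical helper).

    x         : list of input feature vectors, one per node
    neighbors : neighbors[i] lists the nodes j in N(i), without self-loops
    theta     : weight matrix Theta as a list of rows (F_out x F_in)
    a         : attention vector of length 2 * F_out
    Returns (new_features, alpha), where alpha[i][j] is alpha_{i,j}.
    """
    h = [matvec(theta, xi) for xi in x]  # Theta x_i for every node
    new_features, alpha = [], []
    for i, hi in enumerate(h):
        # The softmax runs over N(i) plus i itself.
        nbrs = sorted(set(neighbors[i]) | {i})
        # Unnormalized scores LeakyReLU(a^T [Theta x_i || Theta x_k]).
        scores = {
            k: leaky_relu(sum(w * v for w, v in zip(a, hi + h[k])), negative_slope)
            for k in nbrs
        }
        # Softmax; subtracting the maximum keeps exp() numerically stable.
        top = max(scores.values())
        exps = {k: math.exp(s - top) for k, s in scores.items()}
        total = sum(exps.values())
        alpha_i = {k: e / total for k, e in exps.items()}
        # x'_i = sum over k of alpha_{i,k} * Theta x_k (self term included).
        new_features.append(
            [sum(alpha_i[k] * h[k][d] for k in nbrs) for d in range(len(hi))]
        )
        alpha.append(alpha_i)
    return new_features, alpha


# Toy graph with three nodes: node 0 is connected to nodes 1 and 2.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1, 2], [0], [0]]
theta = [[1.0, 0.0], [0.0, 1.0]]  # identity, so Theta x_i == x_i
a = [0.5, -0.5, 1.0, 0.25]        # made-up attention vector
new_x, alpha = gat_layer(x, neighbors, theta, a)
```

Each `alpha[i]` sums to 1 over node `i` and its neighbors, so every updated feature vector is a convex combination of the transformed features in that neighborhood. Supporting edge features would mean appending a transformed edge vector to the concatenation before the dot product with `a`, which then needs a correspondingly longer `a`.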

  • in_channels (int or tuple) – Size of each input sample, or -1 to derive the size from the first input(s) to the forward method. A tuple corresponds to the sizes of source and target dimensionalities.

  • out_channels (int) – Size of each output sample.

  • heads (int, optional) – Number of multi-head-attentions. (default: 1)

  • concat (bool, optional) – If set to False, the multi-head attentions are averaged instead of concatenated. (default: True)

  • negative_slope (float, optional) – LeakyReLU angle of the negative slope. (default: 0.2)

  • dropout (float, optional) – Dropout probability of the normalized attention coefficients which exposes each node to a stochastically sampled neighborhood during training. (default: 0)

  • add_self_loops (bool, optional) – If set to False, will not add self-loops to the input graph. (default: True)

  • edge_dim (int, optional) – Edge feature dimensionality (in case there are any). (default: None)

  • fill_value (float or Tensor or str, optional) – The way to generate edge features of self-loops (in case edge_dim != None). If given as float or torch.Tensor, edge features of self-loops will be directly given by fill_value. If given as str, edge features of self-loops are computed by aggregating all features of edges that point to the specific node, according to a reduce operation. ("add", "mean", "min", "max", "mul"). (default: "mean")

  • bias (bool, optional) – If set to False, the layer will not learn an additive bias. (default: True)

  • **kwargs (optional) – Additional arguments of conv.MessagePassing.
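As a rough illustration of what `heads` and `concat` mean for the output size, the sketch below combines the per-head outputs of a single node in both ways. The helper `combine_heads` and the numbers are hypothetical and only mirror the behaviour described in the parameter list above; they are not part of torch_geometric.

```python
def combine_heads(head_outputs, concat=True):
    """Combine the per-head feature vectors of one node (hypothetical helper).

    head_outputs : list of `heads` vectors, each of length out_channels
    concat=True  : returns one vector of length heads * out_channels
    concat=False : returns the element-wise mean, of length out_channels
    """
    if concat:
        return [value for head in head_outputs for value in head]
    heads = len(head_outputs)
    return [sum(values) / heads for values in zip(*head_outputs)]


# Two heads (heads=2) with out_channels=2; the numbers are made up.
heads_out = [[1.0, 2.0], [3.0, 6.0]]
concatenated = combine_heads(heads_out, concat=True)   # length 4
averaged = combine_heads(heads_out, concat=False)      # length 2
```

With `concat=True` the next layer therefore has to accept `heads * out_channels` input features, whereas `concat=False` keeps the width at `out_channels`.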

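        from typing import Optional, Tuple, Union
        from torch import Tensor
        from torch_geometric.typing import Adj, OptPairTensor, OptTensor, Size

        # Signatures of GATConv.__init__ and GATConv.forward; bodies omitted.
        def __init__(
            self,
            in_channels: Union[int, Tuple[int, int]],
            out_channels: int,
            heads: int = 1,
            concat: bool = True,
            negative_slope: float = 0.2,
            dropout: float = 0.0,
            add_self_loops: bool = True,
            edge_dim: Optional[int] = None,
            fill_value: Union[float, Tensor, str] = 'mean',
            bias: bool = True,
            **kwargs,
        ): ...

        def forward(self, x: Union[Tensor, OptPairTensor], edge_index: Adj,
                    edge_attr: OptTensor = None, size: Size = None,
                    return_attention_weights=None): ...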

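Note: the paper uses the terms transductive learning and inductive learning. In the transductive setting, the test nodes (without their labels) are already part of the graph seen during training; in the inductive setting, the model is evaluated on graphs that were never seen during training.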

Transductive learning, TBYourHero's blog (CSDN)







Transductive learning, TBYourHero's blog (CSDN)

torch_geometric.nn — pytorch_geometric documentation (pytorch-geometric.readthedocs.io)

[1710.10903] Graph Attention Networks (arxiv.org)
