YEAR: Submitted on 3 Jan 2019 (v1), last revised 4 Dec 2019 (this version, v4)
FROM: ArXiv 2019
WHAT PROBLEM TO SOLVE: Existing surveys only include some of the GNNs and examine a limited number of works, thereby missing the most recent development of GNNs.
This paper makes notable contributions summarized as follows:
New taxonomy
Recurrent graph neural networks, Convolutional graph neural networks, Graph autoencoders, and Spatial-temporal graph neural networks.
Comprehensive review
Provide detailed descriptions on representative models, make the necessary comparison, and summarise the corresponding algorithms.
Abundant resources
Including state-of-the-art models, benchmark data sets, open-source codes, and practical applications.
Future directions
Model depth, scalability trade-off, heterogeneity, and dynamicity.
Taxonomy of GNNs
RecGNNs: Aim to learn node representations with recurrent neural architectures. They assume a node in a graph constantly exchanges information/message with its neighbors until a stable equilibrium is reached.
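The "exchange until equilibrium" idea can be illustrated with a small fixed-point iteration. This is only a sketch under assumptions that are not in the survey: scalar node states, a hypothetical update rule `tanh(w_self * x_v + w_neigh * mean of neighbor states)`, and weights chosen so the update is contractive.

```python
import math


def run_to_equilibrium(adj, features, init=None, w_self=0.3, w_neigh=0.3,
                       tol=1e-9, max_iters=1000):
    """Repeat h_v <- tanh(w_self * x_v + w_neigh * mean_{u in N(v)} h_u) until stable.

    tanh is 1-Lipschitz and |w_neigh| < 1, so the update is a contraction and the
    iteration reaches the same fixed point from any starting state.
    """
    n = len(adj)
    state = list(init) if init is not None else [0.0] * n
    for step in range(1, max_iters + 1):
        new_state = []
        for v in range(n):
            neighbors = adj[v]
            mean = sum(state[u] for u in neighbors) / len(neighbors) if neighbors else 0.0
            new_state.append(math.tanh(w_self * features[v] + w_neigh * mean))
        change = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if change < tol:
            return state, step
    return state, max_iters


adj = [[1, 2], [0, 2], [0, 1, 3], [2]]   # adjacency lists of a small example graph
x = [1.0, -0.5, 0.25, 2.0]               # one scalar feature per node
equilibrium, steps = run_to_equilibrium(adj, x)
```

The same parameters (`w_self`, `w_neigh`) are reused at every iteration, which is the recurrent aspect; the number of iterations is decided by convergence rather than fixed in advance.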
ConvGNNs: Generate a node v’s representation by aggregating its own features x_v and neighbors’ features x_u, where u ∈ N(v). Different from RecGNNs, ConvGNNs stack multiple graph convolutional layers to extract high-level node representations.
GAEs: Unsupervised learning frameworks which encode nodes/graphs into a latent vector space and reconstruct graph data from the encoded information. GAEs are used to learn network embeddings and graph generative distributions.
STGNNs: Consider spatial dependency and temporal dependency at the same time. Many current approaches integrate graph convolutions to capture spatial dependency with RNNs or CNNs to model the temporal dependency.
Level Tasks
Training Framework
Representative RecGNNs and ConvGNNs
Time complexity: O(m) if the graph adjacency matrix is sparse and O(n^2) otherwise, where n is the number of nodes and m the number of edges; some models reach O(n^3) due to other operations.
RecGNNs apply the same set of parameters recurrently over nodes in a graph to extract high-level node representations.
Instead of iterating node states with contractive constraints, ConvGNNs address the cyclic mutual dependencies architecturally using a fixed number of layers with different weights in each layer.
Spectral-based ConvGNNs
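Graph convolution of the input signal x with a filter $g \in \mathbb{R}^n$ is defined as:
$$x *_G g = \mathcal{F}^{-1}(\mathcal{F}(x) \odot \mathcal{F}(g)) = U(U^T x \odot U^T g),$$
where U is the matrix of eigenvectors of the normalized graph Laplacian and $\odot$ denotes the element-wise product.
If we denote a filter as $g_\theta = \mathrm{diag}(U^T g)$, then the spectral graph convolution is simplified as:
$$x *_G g_\theta = U g_\theta U^T x.$$
Spectral-based ConvGNNs all follow this definition. The key difference lies in the choice of the filter $g_\theta$.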
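As a worked illustration of filtering in the spectral domain, the sketch below applies a diagonal filter on a three-node path graph. The graph, its hand-written eigen-decomposition, and the two filters are assumptions chosen for the example, not taken from the survey.

```python
import math

S = math.sqrt(2.0)

# Normalized Laplacian L = I - D^{-1/2} A D^{-1/2} of the path graph 0 - 1 - 2.
L = [[1.0, -1 / S, 0.0],
     [-1 / S, 1.0, -1 / S],
     [0.0, -1 / S, 1.0]]

# Eigen-decomposition L = U diag(EIGENVALUES) U^T, written out by hand for this graph.
# Column j of U is the eigenvector belonging to EIGENVALUES[j].
EIGENVALUES = [0.0, 1.0, 2.0]
U = [[0.5, 1 / S, 0.5],
     [1 / S, 0.0, -1 / S],
     [0.5, -1 / S, 0.5]]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def spectral_conv(x, theta):
    """Compute U diag(theta) U^T x: transform, scale each frequency, transform back."""
    x_hat = matvec(transpose(U), x)                  # graph Fourier transform
    y_hat = [t * c for t, c in zip(theta, x_hat)]    # apply the diagonal filter
    return matvec(U, y_hat)                          # inverse transform


x = [1.0, 2.0, 3.0]
identity = spectral_conv(x, [1.0, 1.0, 1.0])                      # all-pass filter
smoothed = spectral_conv(x, [1.0 - lam for lam in EIGENVALUES])   # equals (I - L) x
```

The all-pass filter returns the input unchanged because U is orthonormal, and a filter that is a function of the eigenvalues gives the same result as applying that function to L directly.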
Spatial-based ConvGNNs
The spatial-based graph convolutions convolve the central node’s representation with its neighbors’ representations to derive the updated representation for the central node. From another perspective, spatial-based ConvGNNs share the same idea of information propagation/message passing with RecGNNs. The spatial graph convolutional operation essentially propagates node information along edges.
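A minimal sketch of this idea, assuming scalar node features and a hypothetical layer of the form `relu(w_self * h_v + w_neigh * sum of neighbor features)`. Two layers with different weights are stacked, in contrast to a recurrent model that reuses one set of weights.

```python
def spatial_conv(adj, h, w_self, w_neigh):
    """h_v' = relu(w_self * h_v + w_neigh * sum of h_u over neighbors u of v)."""
    return [max(0.0, w_self * h[v] + w_neigh * sum(h[u] for u in adj[v]))
            for v in range(len(adj))]


adj = [[1], [0, 2], [1]]                  # path graph 0 - 1 - 2
h0 = [1.0, 0.0, 0.0]                      # only node 0 carries a signal
h1 = spatial_conv(adj, h0, 1.0, 1.0)      # layer 1: signal reaches node 1
h2 = spatial_conv(adj, h1, 0.5, 2.0)      # layer 2 (different weights): reaches node 2
```

After the first layer node 2 is still zero; after the second it is not, which shows information moving one edge per layer.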
Neural Network for Graphs (NN4G)
Contextual Graph Markov Model (CGMM)
Diffusion Convolutional Neural Network (DCNN)
Diffusion Graph Convolution (DGC)
Partition Graph Convolution (PGC)
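Message Passing Neural Network (MPNN): Outlines a general framework of spatial-based ConvGNNs. It treats graph convolutions as a message passing process in which information can be passed from one node to another along edges directly. MPNN runs K-step message passing iterations to let information propagate further. The message passing function (namely the spatial graph convolution) is defined as:
$$h_v^{(k)} = U_k\left(h_v^{(k-1)}, \sum_{u \in N(v)} M_k\left(h_v^{(k-1)}, h_u^{(k-1)}, x^e_{vu}\right)\right),$$
where $h_v^{(0)} = x_v$, and $U_k(\cdot)$ and $M_k(\cdot)$ are functions with learnable parameters. For graph-level tasks, a readout function $R(\cdot)$ maps the final node states $h_v^{(K)}$ to a representation of the whole graph.
MPNN can cover many existing GNNs by assuming different forms of $U_k(\cdot)$, $M_k(\cdot)$, and $R(\cdot)$.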
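The framework can be sketched with the message, update, and readout functions passed in as plain Python callables. The concrete choices below (scalar states, message `e * h_u`, update `h_v + m`, readout `sum`) are illustrative assumptions, not the functions of any particular model.

```python
def mpnn(adj, x, edge_feat, message_fns, update_fns, readout):
    """Run one message passing step per (M_k, U_k) pair, then apply the readout R."""
    h = dict(x)                                       # initial states are the node features
    for message, update in zip(message_fns, update_fns):
        new_h = {}
        for v in adj:
            aggregated = sum(message(h[v], h[u], edge_feat[(v, u)]) for u in adj[v])
            new_h[v] = update(h[v], aggregated)
        h = new_h                                     # all nodes update synchronously
    return h, readout(list(h.values()))


def message(h_v, h_u, e):
    return e * h_u            # hypothetical M_k: neighbor state scaled by the edge feature


def update(h_v, m):
    return h_v + m            # hypothetical U_k: add aggregated messages to own state


adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
x = {"a": 1, "b": 2, "c": 3}
edge_feat = {("a", "b"): 1, ("b", "a"): 1, ("b", "c"): 2, ("c", "b"): 2}
states, graph_repr = mpnn(adj, x, edge_feat, [message] * 2, [update] * 2, sum)
```

Swapping in different callables changes the model without touching the propagation loop, which is the sense in which the framework covers many GNNs.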
Graph Isomorphism Network (GIN)
Graph Attention Network (GAT)
Gated Attention Network (GAAN)
Mixture Model Network (MoNet)
Large-scale Graph Convolutional Network (LGCN)
Improvement in terms of training efficiency
Training ConvGNNs such as GCN [22] usually requires saving the whole graph data and the intermediate states of all nodes in memory. Full-batch training of ConvGNNs therefore suffers significantly from memory overflow, especially when a graph contains millions of nodes.
Comparison between spectral and spatial models
Spatial models are preferred over spectral models for reasons of efficiency, generality, and flexibility.
Graph Pooling Modules
Discussion of Theoretical Aspects
Shape of receptive field
As a result, a ConvGNN is able to extract global information by stacking local graph convolutional layers.
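The growth of the receptive field with depth can be traced directly on the graph structure. The helper below is a hypothetical illustration: it tracks which nodes can influence a given node after a number of local layers, assuming each layer reads only direct neighbors.

```python
def receptive_field(adj, node, num_layers):
    """Nodes whose features can reach `node` after `num_layers` local layers."""
    field = {node}
    for _ in range(num_layers):
        field = field | {u for v in field for u in adj[v]}
    return field


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}   # path graph 0 - 1 - 2 - 3 - 4
one_layer = receptive_field(path, 0, 1)
two_layers = receptive_field(path, 0, 2)
four_layers = receptive_field(path, 0, 4)
```

On this path graph the field of node 0 grows by one hop per layer and covers every node once the depth matches the distance to the farthest node.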
VC dimension
Graph isomorphism
Common GNNs such as GCN and GraphSage cannot distinguish certain different graph structures. Message-passing GNNs are at most as powerful as the WL test in distinguishing different graphs; if the aggregation functions and the readout functions of a GNN are injective, the GNN is as powerful as the WL test.
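For reference, the WL test mentioned here can be sketched as iterative color refinement. This is a simplified version under stated assumptions: all nodes start with the same color, and colors are kept as nested tuples rather than compressed labels so that two graphs can be compared directly.

```python
from collections import Counter


def wl_colors(adj, num_iters=3):
    """Run 1-WL color refinement and return the multiset of final node colors."""
    colors = {v: 0 for v in adj}
    for _ in range(num_iters):
        # New color = own color plus the sorted multiset of neighbor colors.
        colors = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
    return Counter(colors.values())


def wl_distinguishes(adj_a, adj_b, num_iters=3):
    """True if the test proves the graphs non-isomorphic; False means 'cannot tell'."""
    return wl_colors(adj_a, num_iters) != wl_colors(adj_b, num_iters)


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

path_vs_star = wl_distinguishes(path, star)
triangles_vs_hexagon = wl_distinguishes(two_triangles, hexagon)
```

The path and the star are separated after one round because their degree patterns differ. Two disjoint triangles and a six-node cycle are not isomorphic, yet every node has degree 2 in both, so the test cannot separate them.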
Equivariance and invariance
A GNN must be an equivariant function when performing node-level tasks and must be an invariant function when performing graph-level tasks.
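Both properties can be checked numerically on a toy layer. The sketch assumes a hypothetical sum-aggregation layer with integer features and a sum readout; relabeling the nodes should permute the node outputs in the same way (equivariance) and leave the readout unchanged (invariance).

```python
def sum_layer(adj, h):
    """Node-level output: own feature plus the sum of neighbor features."""
    return [h[v] + sum(h[u] for u in adj[v]) for v in range(len(adj))]


def permute_graph(adj, h, perm):
    """Relabel node v as perm[v] in both the adjacency lists and the features."""
    n = len(adj)
    new_adj = [None] * n
    new_h = [None] * n
    for v in range(n):
        new_adj[perm[v]] = [perm[u] for u in adj[v]]
        new_h[perm[v]] = h[v]
    return new_adj, new_h


adj = [[1], [0, 2], [1]]
h = [1, 2, 4]
perm = [2, 0, 1]

out = sum_layer(adj, h)
adj_p, h_p = permute_graph(adj, h, perm)
out_p = sum_layer(adj_p, h_p)

equivariant = all(out_p[perm[v]] == out[v] for v in range(len(adj)))
invariant = sum(out_p) == sum(out)
```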
Universal approximation
Graph autoencoders (GAEs) are deep neural architectures which map nodes into a latent feature space and decode graph information from latent representations.
Network Embedding
GAEs learn network embeddings using an encoder to extract network embeddings and using a decoder to enforce network embeddings to preserve the graph topological information such as the PPMI matrix and the adjacency matrix.
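A minimal sketch of the encoder/decoder pairing, assuming a one-layer encoder `Z = (A + I) X W` and an inner-product decoder `sigmoid(z_u . z_v)` that reconstructs the adjacency matrix. The feature matrix and the identity weight are illustrative assumptions, and no training is performed.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def encode(adj_matrix, features, weight):
    """Z = (A + I) X W: each node sums its own and its neighbors' features."""
    n = len(adj_matrix)
    a_hat = [[adj_matrix[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    return matmul(matmul(a_hat, features), weight)


def decode(z):
    """Edge probability for every node pair: sigmoid(z_u . z_v)."""
    return [[1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(zu, zv)))) for zv in z]
            for zu in z]


def reconstruction_loss(adj_matrix, probs):
    """Mean binary cross-entropy over all ordered pairs u != v."""
    n = len(adj_matrix)
    total = 0.0
    for u in range(n):
        for v in range(n):
            if u != v:
                p = probs[u][v]
                total -= math.log(p) if adj_matrix[u][v] else math.log(1.0 - p)
    return total / (n * (n - 1))


A = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]   # edges 0-1 and 2-3
X = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]

Z = encode(A, X, W)
loss_structured = reconstruction_loss(A, decode(Z))
Z_shuffled = [Z[0], Z[2], Z[1], Z[3]]          # embeddings that ignore the topology
loss_shuffled = reconstruction_loss(A, decode(Z_shuffled))
```

Embeddings that place connected nodes close together reconstruct the adjacency matrix with a lower loss than embeddings that do not, which is the pressure the decoder puts on the encoder.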
Graph Generation
With multiple graphs, GAEs are able to learn the generative distribution of graphs by encoding graphs into hidden representations and decoding a graph structure given hidden representations.
In brief, sequential approaches linearize graphs into sequences. They can lose structural information due to the presence of cycles. Global approaches produce a graph all at once. They are not scalable to large graphs as the output space of a GAE is up to $O(n^2)$.
STGNNs capture spatial and temporal dependencies of a graph simultaneously. The task of STGNNs can be forecasting future node values or labels, or predicting spatial-temporal graph labels. STGNNs follow two directions, RNN-based methods and CNN-based methods.
Most RNN-based approaches capture spatial-temporal dependencies by filtering inputs and hidden states passed to a recurrent unit using graph convolutions.
RNN-based approaches suffer from time-consuming iterative propagation and gradient explosion/vanishing issues.
CNN-based approaches tackle spatial-temporal graphs in a non-recursive manner with the advantages of parallel computing, stable gradients, and low memory requirements.
CNN-based approaches interleave 1D-CNN layers with graph convolutional layers to learn temporal and spatial dependencies respectively.
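The interleaving can be sketched as one block that applies a temporal 1D convolution per node followed by a graph convolution per time step. This is a simplified illustration with scalar signals, a hypothetical difference kernel, and mean aggregation standing in for a learned graph convolution.

```python
def temporal_conv(x, kernel):
    """Valid 1D convolution (cross-correlation form) along time, per node. x[t][v]."""
    k = len(kernel)
    num_nodes = len(x[0])
    return [[sum(kernel[i] * x[t + i][v] for i in range(k)) for v in range(num_nodes)]
            for t in range(len(x) - k + 1)]


def graph_conv(x, adj):
    """At every time step, average each node's value with its neighbors' values."""
    return [[(row[v] + sum(row[u] for u in adj[v])) / (1 + len(adj[v]))
             for v in range(len(row))]
            for row in x]


def st_block(x, adj, kernel):
    """Temporal layer first, then spatial layer; no recurrence over time steps."""
    return graph_conv(temporal_conv(x, kernel), adj)


adj = [[1], [0, 2], [1]]                              # path graph 0 - 1 - 2
signal = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]  # 4 time steps, 3 nodes
out = st_block(signal, adj, kernel=[-1, 1])           # kernel computes x[t+1] - x[t]
```

Every output time step is computed independently of the others, which is what allows this style of model to run in parallel over time.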
Data Sets
Evaluation & Open-source Implementations
Node Classification: In node classification, most methods follow a standard train/valid/test split on benchmark data sets including Cora, Citeseer, Pubmed, PPI, and Reddit. They report the average accuracy or F1 score on the test data set over multiple runs.
Graph Classification: One evaluation procedure is a double cross-validation (cv) method, which uses an external k-fold cv for model assessment and an inner k-fold cv for model selection.
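The double cv procedure can be sketched as two nested loops over folds. Everything here is a hypothetical illustration: `fit_and_score` stands for whatever training and evaluation routine is used, and the toy scoring function ignores the data entirely.

```python
def k_fold(indices, k):
    """Yield (train, test) index lists; every index is in exactly one test fold."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def double_cv(indices, outer_k, inner_k, candidates, fit_and_score):
    """Outer folds estimate performance; inner folds pick the hyperparameter."""
    outer_scores = []
    for outer_train, outer_test in k_fold(indices, outer_k):

        def inner_score(candidate):
            scores = [fit_and_score(candidate, tr, va)
                      for tr, va in k_fold(outer_train, inner_k)]
            return sum(scores) / len(scores)

        best = max(candidates, key=inner_score)       # model selection, no test data used
        outer_scores.append(fit_and_score(best, outer_train, outer_test))
    return sum(outer_scores) / len(outer_scores)


def toy_fit_and_score(candidate, train_idx, test_idx):
    """Stand-in for real training: pretends 3 is the ideal hyperparameter value."""
    return -abs(candidate - 3)


estimate = double_cv(list(range(10)), outer_k=5, inner_k=2,
                     candidates=[1, 3, 5], fit_and_score=toy_fit_and_score)
```

The outer test fold is never seen during model selection, so the reported estimate is not biased by the hyperparameter search.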
Open-source implementations
Practical Applications
Computer vision
Applications of GNNs in computer vision include scene graph generation, point cloud classification, and action recognition.
Natural language processing
A common application of GNNs in natural language processing is text classification.
GNNs utilize the inter-relations of documents or words to infer document labels.
In traffic networks, STGNNs are used to forecast traffic speed, volume, or the density of roads.
Another industrial-level application is taxi-demand prediction with historical taxi demands, location information, weather data, and event features.
Recommender systems
Graph-based recommender systems take items and users as nodes. By leveraging the relations between items and items, users and users, users and items, as well as content information, graph-based recommender systems are able to produce high-quality recommendations. The key to a recommender system is to score the importance of an item to a user. As a result, it can be cast as a link prediction problem.
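Casting recommendation as link prediction amounts to scoring candidate user-item edges. The sketch assumes user and item embeddings have already been produced by some GNN (the vectors below are made up) and uses a dot product as a hypothetical scoring function.

```python
def link_score(user_vec, item_vec):
    """Score of a potential user-item edge: dot product of the two embeddings."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def recommend(user_vec, item_vecs, seen, top_k=1):
    """Rank items the user has no edge to yet by their predicted link score."""
    candidates = [item for item in item_vecs if item not in seen]
    candidates.sort(key=lambda item: link_score(user_vec, item_vecs[item]), reverse=True)
    return candidates[:top_k]


item_vecs = {"a": [0.9, 0.1], "b": [0.1, 0.9], "c": [0.5, 0.5]}
user_vec = [1.0, 0.0]
top_one = recommend(user_vec, item_vecs, seen={"a"})
top_two = recommend(user_vec, item_vecs, seen={"a"}, top_k=2)
```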
In the field of chemistry, researchers apply GNNs to study the graph structure of molecules/compounds.
Other applications include program verification, program reasoning, social influence prediction, adversarial attack prevention, electronic health records modeling, brain networks, event detection, and combinatorial optimization.
Model depth
The performance of a ConvGNN drops dramatically with an increase in the number of graph convolutional layers.
In theory, with an infinite number of graph convolutional layers, all nodes’ representations will converge to a single point. This raises the question of whether going deep is still a good strategy for learning graph data.
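The collapse of node representations can be observed with a simplified layer. The sketch assumes a purely linear mean-aggregation layer with self-loops and no learned weights or nonlinearity, applied repeatedly on a connected path graph; real ConvGNN layers are more complex, so this only illustrates the tendency.

```python
def smooth(adj, h):
    """One layer: replace each value by the mean over the node and its neighbors."""
    return [(h[v] + sum(h[u] for u in adj[v])) / (1 + len(adj[v]))
            for v in range(len(adj))]


def spread_after(adj, h, num_layers):
    """Gap between the largest and smallest node value after `num_layers` layers."""
    for _ in range(num_layers):
        h = smooth(adj, h)
    return max(h) - min(h)


adj = [[1], [0, 2], [1, 3], [2]]          # path graph 0 - 1 - 2 - 3
h0 = [1.0, 0.0, 0.0, 0.0]

spread_1 = spread_after(adj, h0, 1)
spread_10 = spread_after(adj, h0, 10)
spread_200 = spread_after(adj, h0, 200)
```

The spread shrinks with every additional layer, and after enough layers the node values are numerically indistinguishable.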
Scalability trade-off
The scalability of GNNs is gained at the price of corrupting graph completeness. Whether using sampling or clustering, a model will lose part of the graph information.
It is difficult to directly apply current GNNs to heterogeneous graphs, which may contain different types of nodes and edges, or different forms of node and edge inputs, such as images and text.
Graphs are by nature dynamic: nodes or edges may appear or disappear, and node/edge inputs may change over time.