[Paper Close Reading] Classification of Brain Disorders in rs-fMRI via Local-to-Global Graph Neural Networks

Original paper: Classification of Brain Disorders in rs-fMRI via Local-to-Global Graph Neural Networks | IEEE Journals & Magazine | IEEE Xplore

Contents

1. TL;DR

1.1. Takeaways

1.2. Paper Framework Diagram

2. Section-by-Section Close Reading

2.1. Abstract

2.2. Introduction

2.3. Related Work

2.3.1. Handcrafted Methods

2.3.2. GNN-Based Methods

2.4. Proposed Method

2.4.1. Local ROI-GNN

2.4.2. Global Subject-GNN

2.5. Materials and Classification Evaluation

2.5.1. Materials

2.5.2. Experimental Setting

2.6. Experimental Results

2.6.1. Classification Results on Different Datasets

2.6.2. Ablation Studies

2.6.3. Biomarker Detection

2.6.4. Discussion

2.7. Conclusion

3. Reference List


1. TL;DR

1.1. Takeaways

(1)For the ablation study they actually swap in other modules as replacements, which is pretty bold... but it does seem to show that their model is really strong, haha

1.2. Paper Framework Diagram

[Image 1]

2. Section-by-Section Close Reading

2.1. Abstract

        ①The authors point out that current fMRI classification models either ignore non-imaging information and the relationships between subjects, or misidentify brain regions and biomarkers

        ②Then, they put forward a local-to-global graph neural network (LG-GNN) to solve these problems

2.2. Introduction

        ①The diagnosis of Autism Spectrum Disorder (ASD) and Alzheimer’s disease (AD) is still limited and immature

        ②A map of the brain under the Harvard-Oxford atlas:

[Image 2]

        ③Graph neural networks (GNNs) have been found suitable for brain network analysis

        ④There are two types of GNN: those built on regional brain graphs and those built on subject graphs. The first type is good at analyzing local brain regions and biomarkers, but ignores age, gender and the relationships between subjects. The subject-graph type is just the opposite

        ⑤Therefore, they combine the two approaches to get the advantages of both: a local stage first, then an expansion to the global stage

        ⑥Contribution: a) an end-to-end LG-GNN, b) a pooling strategy based on an attention mechanism, c) excellent performance on 2 datasets

etiological  adj. relating to the cause or origin of a disease

aberrant  adj. abnormal; deviating from the norm

2.3. Related Work

2.3.1. Handcrafted Methods

        ①Functional connectivity is constructed by calculating the Pearson correlation coefficient, or by "extracting weighted local clustering coefficients from the brain connectivity network from rs-fMRI, and then employed multiple-kernel based-SVM algorithm for the subsequent MCI classification"

        ②"Using SVM to classify AD from normal control (NC) after PCA and Student’s t-test for dimension reduction based on shape and diffusion tensor imaging"

        ③Also, the choice of ROIs may affect the prediction accuracy

2.3.2. GNN-Based Methods

(1)GNNs Based on Regional Brain Graphs

        Models such as DS-GCNs, s-GCN, MVS-GCN and mutual multi-scale triplet GCN etc. adopt regional graphs

(2)GNNs Based on Subject Graphs

        Models such as GCN, InceptionGCN and SAC-GCN etc. adopt subject graphs

2.4. Proposed Method

        ①Framework of LG-GNN:

[Image 3]

2.4.1. Local ROI-GNN

(1)Regional Brain Graph Construction

        How the graph G_{local}=\{\mathbf{V},\mathbf{A}\} is constructed, with each graph represented by the node feature matrix \mathbf{X}=\left[x_1,\ldots,x_n\right]^\top:

\mathbf{A}_{ij}=\begin{cases}\frac{Cov(v_i,v_j)}{\sigma_{v_i}\sigma_{v_j}},&\text{if }v_i\text{ and }v_j\text{ are adjacent},\\1,&\text{if }i=j,\\0,&\text{otherwise}.\end{cases}
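A minimal Python sketch of this adjacency rule may make it more concrete. The notes do not say how "adjacent" is decided, so the boolean mask `adjacent` below is a hypothetical input, as are the toy time series; the off-diagonal weight is simply the Pearson correlation Cov(v_i, v_j) / (σ_{v_i} σ_{v_j}).

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (Cov / (sigma_x * sigma_y))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the two standard deviations cancel.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def build_adjacency(series, adjacent):
    """series: one time series per ROI.
    adjacent: hypothetical boolean mask saying which ROI pairs count as adjacent."""
    n = len(series)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                A[i][j] = 1.0                      # self-connection
            elif adjacent[i][j]:
                A[i][j] = pearson(series[i], series[j])
    return A

# Toy example with 3 ROIs; ROI 0 and ROI 2 are not adjacent.
series = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
adjacent = [[False, True, False], [True, False, True], [False, True, False]]
A = build_adjacency(series, adjacent)
```

In this toy case the adjacent pair (0, 1) is perfectly correlated and the pair (1, 2) is perfectly anti-correlated, while the non-adjacent pair (0, 2) stays at 0 regardless of its correlation.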

(2)Local ROI-GNN Model

        ①Local ROI-GNN framework:

[Image 4]

It consists of three graph convolution (GC) layers and the Self-Attention Based Pooling (SABP) module

        ②The propagation rule of a GC layer:

\begin{aligned} \mathbf{X}^{(t)}=& \mathrm{ReLU}\left(\mathrm{GC}(\mathbf{X}^{(t-1)},\mathbf{A})\right) \\ =& \operatorname{ReLU}\left(\mathbf{D}^{-\frac12}\mathbf{AD}^{-\frac12}\mathbf{X}^{(t-1)}\mathbf{W}^{(t)}\right) \end{aligned}

where \mathbf{D}=\mathrm{diag}\left(\sum_j\mathbf{A}_{1j},\sum_j\mathbf{A}_{2j},\ldots,\sum_j\mathbf{A}_{nj}\right) denotes the degree matrix of \mathbf{A};

\mathbf{W}^{(t)} represents trainable weight matrix in the t-th layer;

\mathbf{X}^{(t)}=\left[x_1^{(t)},x_2^{(t)},\ldots,x_n^{(t)}\right]^\top is node representation.

        ③Receptive field of a single GC layer: the one-hop neighborhood

        ④To capture two-hop information, they stack 2 GC layers, which gives \mathbf{X}^{(2)}\in\mathbb{R}^{n\times c}
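The GC rule and the one-hop versus two-hop point can be checked with a small pure-Python sketch. The path graph, the one-dimensional features and the weight matrix `W` below are made up for illustration, and the code assumes every row sum of A is positive so that D^{-1/2} exists.

```python
import math

def matmul(P, Q):
    """Plain matrix product of two lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def gc_layer(X, A, W):
    """One GC layer: ReLU(D^{-1/2} A D^{-1/2} X W), with D the diagonal row-sum matrix of A."""
    deg = [sum(row) for row in A]
    d_inv_sqrt = [1.0 / math.sqrt(d) for d in deg]  # assumes positive row sums
    A_norm = [[d_inv_sqrt[i] * a * d_inv_sqrt[j] for j, a in enumerate(row)]
              for i, row in enumerate(A)]
    H = matmul(matmul(A_norm, X), W)
    return [[max(0.0, h) for h in row] for row in H]

# Path graph 0 - 1 - 2 with self-loops; only node 0 carries a signal.
A = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 1.0],
     [0.0, 1.0, 1.0]]
X0 = [[1.0], [0.0], [0.0]]
W = [[1.0]]  # hypothetical 1x1 weight, standing in for a trained W^(t)

X1 = gc_layer(X0, A, W)  # after one layer node 2 has seen nothing from node 0
X2 = gc_layer(X1, A, W)  # after two layers the signal has travelled two hops
```

Node 2 is two hops away from node 0, so its feature is still zero in `X1` and only becomes positive in `X2`, which is the reason for stacking two GC layers.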

(3)Self-Attention Based Pooling

        ①Pooling is essential for preserving and highlighting the important ROIs

        ②Transform \mathbf{X}^{(2)}\in\mathbb{R}^{n\times c} into an attention score \mathbf{z} by \mathbf{z}=\mathrm{GC}^{\prime}(\mathbf{X}^{(2)},\mathbf{A})

        ③Squash the scores with \tilde{\mathbf{z}}=\tanh(\mathbf{z}) and select the top-k nodes, \mathrm{idx}=\mathrm{topk}(\tilde{\mathbf{z}},k)

        ④Then comes what is (?) the pooling layer, \hat{\mathbf{X}}^{(2)}=\mathbf{X}^{(2)}(\mathrm{idx},:), followed by a reweighting by the scores, \tilde{\mathbf{X}}^{(2)}=\hat{\mathbf{X}}^{(2)}\odot\tilde{\mathbf{z}}

        ⑤Finally, the new adjacency matrix is \widetilde{\mathbf{A}}=\mathbf{A}(\mathrm{idx},\mathrm{idx})

        ⑥They designed a loss to separate ROI weights

\begin{gathered} \mathcal{L}_{MI}= \frac1n\sum_{i=1}^n\mathrm{MLP}(\mathrm{Concat}(\mathrm{GC}(\mathbf{X}^{(2)},\mathbf{A}),\mathbf{X}^{(2)})) \\ -\log\frac1n\sum_{i=1}^ne^{\mathrm{MLP}(\mathrm{Concat}(\overline{\mathbf{X}},\mathbf{X}^{(2)}))} \end{gathered}

where \overline{\mathbf{X}} is \mathbf{X} randomly shuffled along the channels (I don't know what this is for either).

        ⑦The output will be:

\mathbf{Y}=\mathrm{GC}(\tilde{\mathbf{X}}^{(2)},\tilde{\mathbf{A}})

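where \mathbf{Y}=[y_1,y_2,\ldots,y_k]^\top

        ⑧I really want to know why the authors say SABP can take topological relationships into account. Isn't the brain graph undirected, with no ordering between the nodes either??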
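The selection steps of SABP (tanh, top-k, row selection, reweighting, adjacency slicing) can be sketched in a few lines of Python. This is only an illustration: the raw scores `z` are passed in directly instead of coming from a trained GC' layer, the rule `k = ceil(ratio * n)` is an assumption about how the pooling rate maps to k, and the reweighting multiplies each kept node by its own score so that the shapes match.

```python
import math

def sabp_select(X, A, z, ratio):
    """X: n x c node features, A: n x n adjacency,
    z: one raw attention score per node (assumed to come from GC'),
    ratio: fraction of nodes to keep (the pooling rate)."""
    n = len(X)
    k = max(1, math.ceil(ratio * n))             # assumed mapping from rate to k
    z_tilde = [math.tanh(s) for s in z]          # squash scores into (-1, 1)
    idx = sorted(range(n), key=lambda i: z_tilde[i], reverse=True)[:k]
    X_hat = [X[i] for i in idx]                  # X(idx, :)
    X_tilde = [[x * z_tilde[i] for x in row]     # reweight kept nodes by their score
               for i, row in zip(idx, X_hat)]
    A_tilde = [[A[i][j] for j in idx] for i in idx]  # A(idx, idx)
    return X_tilde, A_tilde, idx

# Toy example: 4 nodes, keep half of them.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
A = [[float(4 * i + j) for j in range(4)] for i in range(4)]
z = [2.0, -1.0, 0.5, 0.0]
X_tilde, A_tilde, idx = sabp_select(X, A, z, ratio=0.5)
```

With these scores, nodes 0 and 2 are kept. The pooled adjacency is just the sub-matrix of A restricted to the kept nodes, so the edges among the surviving ROIs are preserved as they were.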

2.4.2. Global Subject-GNN

(1)Subject Graph Construction

        ①There are m subjects, i.e. m nodes in \mathbf{V}'

        ②\mathbf{A}'=\mathbf{C}*\mathbf{W} is the adjacency matrix of \mathbf{V}', where \mathbf{C} is a binarized connectivity matrix that combines image and non-image information (values greater than 0.4 become 1, and values less than 0.4 become 0)

        ③The image-based similarity matrix is \mathbf{S}_1=\exp\left(-\frac{\left[\rho(\mathbf{Y}_i,\mathbf{Y}_j)\right]^2}{2\sigma^2}\right), where \rho denotes the correlation distance and \sigma is the mean of [\rho(\mathbf{Y}_i,\mathbf{Y}_j)]^2

        ④The node similarity metric \mathbf{S}_2 is constructed from non-image information

        ⑤\mathbf{C}^{\prime}=\mathbf{S}_1\odot \mathbf{S}_2 (presumably the matrix that is thresholded to give \mathbf{C})

        ⑥Weight matrix \mathbf{W}_{ij}=\frac{\mathrm{Sim}(\mathrm{MLP}(\eta_i),\mathrm{MLP}(\eta_j))+1}2, where \eta_i and \eta_j are both non-image information, Sim represents the cosine similarity, and the 2 MLPs share the same weights
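Putting steps ①-⑥ together, a rough Python sketch of the subject graph might look like the following. Several pieces are assumptions, since the notes do not spell them out: S2 is taken to be the number of non-image fields two subjects share, \sigma is averaged over all pairs including the diagonal, `embed` is a hypothetical stand-in for the shared MLP, and \mathbf{C}*\mathbf{W} is read as an element-wise product.

```python
import math

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def subject_adjacency(Y, meta, eta, embed, threshold=0.4):
    """Y: per-subject image features, meta: categorical non-image fields,
    eta: numeric non-image vectors, embed: stand-in for the shared MLP."""
    m = len(Y)
    # S1: Gaussian kernel on the squared correlation distance (1 - Pearson r).
    d2 = [[(1.0 - pearson(Y[i], Y[j])) ** 2 for j in range(m)] for i in range(m)]
    sigma = sum(map(sum, d2)) / (m * m)          # mean squared distance (all pairs)
    S1 = [[math.exp(-d2[i][j] / (2 * sigma ** 2)) for j in range(m)] for i in range(m)]
    # S2 (assumed form): number of matching non-image fields.
    S2 = [[sum(a == b for a, b in zip(meta[i], meta[j])) for j in range(m)]
          for i in range(m)]
    # C: C' = S1 * S2 element-wise, binarized at the threshold.
    C = [[1.0 if S1[i][j] * S2[i][j] > threshold else 0.0 for j in range(m)]
         for i in range(m)]
    # W: cosine similarity of embedded non-image data, rescaled to [0, 1].
    E = [embed(e) for e in eta]
    W = [[(cosine(E[i], E[j]) + 1) / 2 for j in range(m)] for i in range(m)]
    return [[C[i][j] * W[i][j] for j in range(m)] for i in range(m)]

# Toy example: subjects 0 and 1 are similar, subject 2 shares no metadata with them.
Y = [[1.0, 2.0, 3.0, 5.0], [2.0, 4.0, 6.0, 9.0], [5.0, 3.0, 2.0, 1.0]]
meta = [("site1", "M"), ("site1", "M"), ("site2", "F")]
eta = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
A_sub = subject_adjacency(Y, meta, eta, embed=lambda e: e)  # identity replaces the MLP
```

The binary matrix C decides whether an edge exists at all, and W then sets its strength, so subject 2 ends up disconnected from the other two even though its W entries are non-zero.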

(2)Global Subject-GNN Model

        ①A multi-scaled residual model was proposed as:

[Image 5]

        ②Cheb conv is:

\mathbf{g}_\theta\star h=\mathbf{U}\mathbf{g}_\theta(\mathbf{\Lambda})\mathbf{U}^\top h

where h\in\mathbb{R}^m is the graph signal, i.e. one feature value per node;

\mathbf{g}_\theta=\operatorname{diag}(\theta) denotes a filter;

\theta\in\mathbb{R}^m denotes the parameters;

\star denotes the convolution operator;

\mathbf{U} is the eigenvector matrix of the normalized graph Laplacian \mathbf{L}=\mathbf{I}-\mathbf{D}^{-\frac12}\mathbf{A}\mathbf{D}^{-\frac12}=\mathbf{U}\mathbf{\Lambda}\mathbf{U}^\top;

        ③To save computation time, they approximate the above with Chebyshev polynomials:

\mathbf{g}_\theta\star h\approx\sum_{k=0}^{K-1}\theta_kT_k(\tilde{\mathbf{L}})h

where \tilde{\mathbf{L}} is the rescaled graph Laplacian and \theta_k denotes a learnable parameter

        ④The recursion is T_k(\tilde{\mathbf{L}})=2\tilde{\mathbf{L}}T_{k-1}(\tilde{\mathbf{L}})-T_{k-2}(\tilde{\mathbf{L}}), with T_0(\tilde{\mathbf{L}})=\mathbf{I} and T_1(\tilde{\mathbf{L}})=\tilde{\mathbf{L}}

        ⑤Given the input \mathbf{H}=\left[Y_1,Y_2,\ldots,Y_m\right]^\top, each layer computes \mathbf{H}^{(l+1)}=\sum_{k=0}^{K-1}\theta_k^{(l)}T_k(\tilde{\mathbf{L}})\mathbf{H}^{(l)}

        ⑥The final embedding is \mathbf{Z}=\sum_lw^{(l)}\odot\mathbf{H}^{(l)},

where w^{(l)} is a learnable weight, w^{(l)}=\text{Softmax}(r^{(l)})=\frac{\exp(r^{(l)})}{\sum_{l'}\exp(r^{(l')})};

r^{(l)} denotes a randomly initialized learnable weight.
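The Chebyshev recursion and the softmax-weighted sum over layers can be sketched as below. This follows the formulas in the notes literally, so each \theta_k is a scalar (in real implementations it is usually a weight matrix) and each w^{(l)} is a single scalar per layer; both are simplifying assumptions. The rescaled Laplacian is taken as an input; in ChebNet it is usually \tilde{\mathbf{L}}=2\mathbf{L}/\lambda_{max}-\mathbf{I}, which the notes do not spell out. The toy matrices and the choice to aggregate only the two layer outputs are made up for illustration.

```python
import math

def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def cheb_conv(L_tilde, H, thetas):
    """Compute sum_k thetas[k] * T_k(L_tilde) H with the three-term recursion."""
    T_prev = H                                        # T_0(L~) H = H
    out = [[thetas[0] * h for h in row] for row in T_prev]
    if len(thetas) == 1:
        return out
    T_curr = matmul(L_tilde, H)                       # T_1(L~) H = L~ H
    out = [[o + thetas[1] * t for o, t in zip(ro, rt)] for ro, rt in zip(out, T_curr)]
    for theta in thetas[2:]:
        LT = matmul(L_tilde, T_curr)
        # T_k = 2 L~ T_{k-1} - T_{k-2}
        T_next = [[2 * a - b for a, b in zip(ra, rb)] for ra, rb in zip(LT, T_prev)]
        out = [[o + theta * t for o, t in zip(ro, rt)] for ro, rt in zip(out, T_next)]
        T_prev, T_curr = T_curr, T_next
    return out

def aggregate(layers, r):
    """Z = sum_l softmax(r)_l * H^(l), with one scalar weight per layer."""
    exp_r = [math.exp(v) for v in r]
    total = sum(exp_r)
    w = [e / total for e in exp_r]
    rows, cols = len(layers[0]), len(layers[0][0])
    Z = [[sum(w[l] * layers[l][i][j] for l in range(len(layers)))
          for j in range(cols)] for i in range(rows)]
    return Z, w

# Toy 2-node example with K = 3 (the order used in the experiments).
L_tilde = [[0.0, 1.0], [1.0, 0.0]]
H0 = [[1.0], [2.0]]
thetas = [0.5, 0.3, 0.2]                 # hypothetical scalar coefficients
H1 = cheb_conv(L_tilde, H0, thetas)
H2 = cheb_conv(L_tilde, H1, thetas)
Z, w = aggregate([H1, H2], r=[0.0, 0.0])  # equal r gives equal layer weights
```

The recursion only ever multiplies by \tilde{\mathbf{L}}, so no eigendecomposition is needed, which is where the computational saving in ③ comes from.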

(3)Total Training Loss

        ①They adopt cross entropy loss as the global loss:

\mathcal{L}_{CE}=\frac1N\sum_i-[q_i\cdot\log(p_i)+(1-q_i)\cdot\log(1-p_i)]

        ②They combine local \mathcal{L}_{MI} and global \mathcal{L}_{CE} to get total loss:

\mathcal{L}_{total}=\mathcal{L}_{CE}+\lambda\mathcal{L}_{MI}

where the hyperparameter \lambda is set to 0.1
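As a small numeric illustration of how the two losses combine, the sketch below evaluates the formulas exactly as written in these notes. The critic scores `pos_scores` and `neg_scores` stand for the MLP outputs on the real pairs and on the shuffled pairs; here they are hypothetical numbers rather than outputs of a trained network. The expression "mean of scores minus log of mean of exponentiated scores" has the same form as the Donsker-Varadhan bound used in neural mutual information estimation.

```python
import math

def cross_entropy(p, q):
    """Binary cross entropy: mean of -[q*log(p) + (1-q)*log(1-p)]."""
    n = len(p)
    return sum(-(qi * math.log(pi) + (1 - qi) * math.log(1 - pi))
               for pi, qi in zip(p, q)) / n

def mi_term(pos_scores, neg_scores):
    """L_MI as written in the notes: mean(pos) - log(mean(exp(neg)))."""
    mean_pos = sum(pos_scores) / len(pos_scores)
    mean_exp_neg = sum(math.exp(s) for s in neg_scores) / len(neg_scores)
    return mean_pos - math.log(mean_exp_neg)

def total_loss(p, q, pos_scores, neg_scores, lam=0.1):
    """L_total = L_CE + lambda * L_MI, with lambda = 0.1 as in the paper."""
    return cross_entropy(p, q) + lam * mi_term(pos_scores, neg_scores)

# Toy values: two subjects with predicted probability 0.5 each.
p = [0.5, 0.5]          # predicted probabilities
q = [1, 0]              # ground-truth labels
loss = total_loss(p, q, pos_scores=[0.0, 0.0], neg_scores=[0.0, 0.0])
```

With all critic scores at zero the MI term vanishes, and the total loss reduces to the cross entropy of a coin-flip predictor, log 2.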

2.5. Materials and Classification Evaluation

2.5.1. Materials

        They use two datasets, ABIDE and ADNI, in 4 tasks:

[Image 6]

(1)ABIDE Dataset

        ①They select only 871 samples out of 1112 subjects, with 403 ASD and 468 NC from 20 different sites.

        ②Preprocess pipeline: C-PAC

        ③Space: normalized in MNI152 space

(2)ADNI Dataset (NC and MCI)

        ①Preprocess: standard protocol

        ②Exclusion: significant artifacts or head movements beyond 2 mm in amplitude

        ③They choose 134 subjects with 96 MCI and 40 AD (wait, I don't quite understand this sentence; 40 + 96 doesn't equal 134, does it?)

(3)ADNI Dataset (pMCI and sMCI)

        ①pMCI patients: deteriorating within 36 months; sMCI patients: do not deteriorate

        ②Atlas: Harvard-Oxford

        ③They choose 41 pMCI and 80 sMCI subjects excluding NaN data

        ④Preprocessing: standard procedures on GRETNA toolkit

2.5.2. Experimental Setting

        ①Optimizer: Adam

(1)Parameters of ABIDE

        ①Dropout rate: 0.3

        ②Learning rate: 0.01

        ③Maximum epochs: 400

        ④Non-image data: acquisition site, gender

(2)Parameters of ADNI

        ①Dropout rate: 0.3

        ②Learning rate: 0.01

        ③Maximum epochs: 300

        ④Non-image data: gender and age

        ②Chebyshev polynomial order K: 3

        ③Cross-validation: 10-fold, 9 for training and 1 for test

        ④Evaluation metrics: classification accuracy (Acc), area under the curve (AUC), sensitivity (Sen) and F1-score

        ⑤What is the ROI Dimension here? Why is it 2000, 2140 and so on? Non-imaging dimension denotes (number of subjects, binary classification)
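For the evaluation protocol above, here is a hedged sketch of a 10-fold split and of the threshold-based metrics (Acc, Sen, F1). The notes do not say whether the folds are shuffled or stratified, so the contiguous split below is only an assumption; AUC is left out because it needs predicted scores rather than hard labels. The function names are made up for illustration.

```python
def kfold_indices(n, k=10):
    """Yield (train, test) index lists; each of the k folds is the test set once."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall of the positive class) and F1-score."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    acc = (tp + tn) / len(y_true)
    sen = tp / (tp + fn) if tp + fn else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * prec * sen / (prec + sen) if prec + sen else 0.0
    return {"Acc": acc, "Sen": sen, "F1": f1}

folds = list(kfold_indices(871, k=10))        # 871 ABIDE samples, 9 folds train / 1 test
scores = metrics([1, 1, 0, 0], [1, 0, 0, 0])  # toy labels and predictions
```

In practice the per-fold metrics would be averaged over the 10 folds to get the numbers reported in the result tables.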

2.6. Experimental Results

2.6.1. Classification Results on Different Datasets

(1)Classification Results on ABIDE

        ①Comparison with handcrafted methods: Ridge Classifier, SVM, Random Forest classifier

        ②Comparison with GNN-based methods: GCN, GAT, BrainGNN, MVS-GCN, PopulationGNN, InceptionGCN, EV-GCN, Hi-GCN

        ③Comparison with deep neural network (DNN) based methods

        ④The classification results (NI denotes whether adopting non-image data or not):

[Image 7]

(2)Classification Results on ADNI (NC and MCI)

        ①Models on ADNI mostly outperform those on ABIDE, because the data in ABIDE are highly heterogeneous (they come from different acquisition sites)

        ②The classification results (NC and AD):

[Image 8]

        ③The classification results on NC and MCI:

[Image 9]

        ④The classification results on pMCI and sMCI:

[Image 10]

2.6.2. Ablation Studies

(1)Ablation Study for Local ROI-GNN

        ①They compared the local ROI-GNN with GAT, GIN, GraphSAGE, ChebNet and GCN, i.e. they plugged each of these SOTA models into their framework in place of the local ROI-GNN

        ②They found that the SABP module significantly enhances performance. Specifically, they reckon an appropriate pooling rate is vital: a pooling rate that is too high may retain a lot of noise, while one that is too low may compress away the topology information of the graph. Thus, they choose a pooling rate of 0.9.

        ③Classification results with the replaced modules:

[Image 11]

        ④Different pooling rates in the SABP module:

[Image 12]

(2)Ablation Study for Global Subject-GNN

        ①They replaced subject-GNN by GATConv, GINConv, GraphSAGEConv, GCNConv

        ②Classification results with the replaced modules:

[Image 13]

        ③To evaluate the effectiveness of the AWAB module, they ran the model with and without AWAB (without it, the output of the last Cheb block is used as the final output); see the table above

2.6.3. Biomarker Detection

        ①They obtain the ROI weights from the SABP module, and select the top 10 ROIs as biomarkers

        ②The mutual information loss \mathcal{L}_{MI} is what separates the ROIs, pushing the weights of important ROIs toward 1 and the others toward 0

        ③The top 10 ROIs with the greatest impact on autism and Alzheimer's disease:

[Image 14]

2.6.4. Discussion

        ①Acquisition site significantly affects the results, and gender also affects classification on the ABIDE dataset:

[Image 15]

        ②Gender has more impact than age on ADNI:

[Image 16]

        ③Other non-imaging data can also be used, such as IQ and genetic information (??? could this be written any more absurdly, seriously, this...)

quotient  n. the result of a division

2.7. Conclusion

        Their model LG-GNN, with local and global modules, achieves excellent performance

3. Reference List

Zhang, H. et al. (2022) 'Classification of Brain Disorders in rs-fMRI via Local-to-Global Graph Neural Networks', IEEE Transactions on Medical Imaging, vol. 42 (2), pp. 444-455. doi: 10.1109/TMI.2022.3219260
