Study Notes on Graph Convolutional Networks: Introduction and Progress

Original video: https://www.bilibili.com/video/av39809391/

Contents

Introduction

Definition

Methods

Applications

Problems and progress


 

Introduction

  • Mathematical definition of convolution:

h(t)=(f*g)(t) \overset{def}{=} \int f(\tau)g(t-\tau ) d\tau =\int f(t-\tau)g(\tau)d\tau
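As a quick numeric illustration (a minimal sketch using NumPy; the arrays are arbitrary), the discrete analogue of the integral is `np.convolve`, and the two equivalent forms of the integral correspond to commutativity:

```python
import numpy as np

# Discrete analogue of the integral above:
# (f*g)[t] = sum_tau f[tau] * g[t - tau]
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.5, 1.0])

fg = np.convolve(f, g)   # [0.5, 2.0, 3.5, 3.0]
gf = np.convolve(g, f)

# The two integrals in the definition are the same convolution,
# i.e. convolution is commutative: f*g = g*f.
assert np.allclose(fg, gf)
```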

  • Convolution in CNNs:

    [Figure 1]

    • Properties of CNN convolution
      • Translation invariance
      • Weight sharing
  • Scenarios where graphs are used (non-Euclidean structures)

 Euclidean vs. non-Euclidean geometry

1. Euclidean geometry is the familiar geometry of the plane and of three-dimensional space, built on axioms about points, lines, and planes.

2. Non-Euclidean geometry refers to geometric systems that differ from Euclidean geometry, generally including hyperbolic (Lobachevskian) geometry and Riemannian geometry.

[Figure 2]

  • Why not directly use a kernel as in CNNs?
    • The number of neighbors varies across nodes
    • There is no canonical ordering of neighbors
  • Question
    • How to define convolution on non-Euclidean structures and extract features for machine learning tasks?
    • Spectral vs. spatial approaches

Definition

  • graph
    • G=(V,E)
    • V is the vertex set
    • E is the edge set
  • adjacency matrix
    • A\in \mathbb{R}^{n \times n}
    • A_{i,j} indicates whether vertices i and j are connected
  • degree matrix
    • D=(D_{ii})
    • D_{ii}=\sum\nolimits_jA_{i,j}
  • Laplacian
    • L=I_N-D^{-\frac{1}{2}}AD^{-\frac{1}{2}}=U\Lambda U^T
    • U=(u_1, u_2, \dots ,u_n), \quad UU^T=U^TU=I
    • \Lambda=diag(\lambda_1, \dots, \lambda_n) contains the eigenvalues of L
    • \{\lambda_1, \dots, \lambda_n\} is called the spectrum of L
    • \{u_1,\dots,u_n\} is an orthonormal basis of \mathbb{R}^n
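To make these definitions concrete, here is a minimal NumPy sketch (the 4-node path graph is a hypothetical example) that builds A, D, and the normalized Laplacian, then checks the stated properties of U and the spectrum:

```python
import numpy as np

# A hypothetical 4-node path graph: 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

D = np.diag(A.sum(axis=1))                       # degree matrix D_ii = sum_j A_ij
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt      # L = I - D^{-1/2} A D^{-1/2}

lam, U = np.linalg.eigh(L)                       # L = U diag(lam) U^T
assert np.allclose(U @ np.diag(lam) @ U.T, L)    # eigendecomposition holds
assert np.allclose(U.T @ U, np.eye(4))           # columns of U are orthonormal
assert np.all(lam >= -1e-10) and np.all(lam <= 2 + 1e-10)  # spectrum lies in [0, 2]
```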

 

  • The convolution in graph convolutional networks is derived from the Fourier transform
  • Traditional Fourier transform:
    • F(w)=\mathcal{F}[f(t)]=\int f(t)e^{-iwt}dt
    • f(t)=\mathcal{F}^{-1}[F(w)]=\frac{1}{2\pi}\int F(w)e^{iwt}dw
    • e^{iwt} can be viewed as a basis of the function space
  • Carrying this over to graphs
    • w \rightarrow \textup{eigenvalue}
    • The graph Fourier transform of a signal x \in \mathbb{R}^n is defined as:
      • \hat{x} = U^Tx
      • Its inverse is x = U\hat{x}
    • convolution:
      • g*x=\mathcal{F}^{-1}[\hat{g}(w)\hat{x}(w)]
    • convolution on graph:
      • g*x=U((U^Tg)\odot(U^Tx))
        • \odot is the element-wise Hadamard product
        • \begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix} \odot \begin{pmatrix} y_1\\ \vdots\\ y_n \end{pmatrix} = \begin{pmatrix} x_1 &\cdots &0\\ \vdots &\ddots &\vdots\\ 0 &\cdots &x_n \end{pmatrix} \begin{pmatrix} y_1\\ \vdots\\ y_n \end{pmatrix}
      • A filter U^Tg=g_\theta(\Lambda)=diag(\theta)
    • Then g*x=Ug_\theta U^Tx
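The whole pipeline above (transform, filter, inverse transform) can be sketched as follows; the 3-node graph and the filter coefficients theta are arbitrary illustrative values:

```python
import numpy as np

# Sketch of g*x = U g_theta(Lambda) U^T x on a toy 3-node graph.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
L = np.eye(3) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
lam, U = np.linalg.eigh(L)            # L = U diag(lam) U^T

x = np.array([1.0, -2.0, 0.5])        # a signal on the graph
theta = np.array([0.9, 0.5, 0.1])     # g_theta(Lambda) = diag(theta)

x_hat = U.T @ x                       # graph Fourier transform
filtered = U @ (theta * x_hat)        # U (theta elementwise U^T x)

# Same thing in matrix form: U diag(theta) U^T x
assert np.allclose(filtered, U @ np.diag(theta) @ U.T @ x)
```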

Methods

  • Then focus on g_\theta
  • Ug_\theta U^Tx=U(\theta_i \hat{x_i})_{i=1}^n
  • There are n parameters \theta_i
  • computational complexity is O(n^3) for the eigendecomposition; in practice, the number of nodes n is often in the tens of thousands
  • Question: fewer parameters, lower complexity

 

  • Approach: polynomial approximation
    • Chebyshev polynomials
      • \left\{\begin{aligned} & T_0(x)=1\\ & T_1(x)=x\\ & T_{n+1}(x)=2xT_n(x)-T_{n-1}(x) \end{aligned}\right.
      • [Figure 3]
      • An orthogonal basis for L^2([-1,1], dy/\sqrt{1-y^2})
  • g_{\theta'}(\Lambda)\approx \sum_{k=0}^K\theta'_kT_k(\tilde{\Lambda}), \ \tilde{\Lambda}=\frac{2}{\lambda_{max}}\Lambda-I_N    
  • Then g_{\theta'}*x \approx \sum_{k=0}^K \theta'_kT_k(\tilde{L})x, \ \tilde{L}=\frac{2}{\lambda_{max}}L-I_N   
  • complexity O(K|E|), K+1 parameters 
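A sketch of this filtering, assuming NumPy and a dense L for simplicity (with a sparse L each step is one sparse matrix-vector product, which is where the O(K|E|) cost comes from); `cheb_filter` is a hypothetical helper name, not from the source:

```python
import numpy as np

# Chebyshev filtering: g_{theta'} * x ≈ sum_k theta'_k T_k(L~) x,
# computed with the recurrence instead of an eigendecomposition.
def cheb_filter(L, x, theta, lam_max=2.0):
    n = L.shape[0]
    L_tilde = (2.0 / lam_max) * L - np.eye(n)   # rescale spectrum into [-1, 1]
    Tx_prev, Tx = x, L_tilde @ x                # T_0(L~)x = x, T_1(L~)x = L~ x
    out = theta[0] * Tx_prev
    if len(theta) > 1:
        out = out + theta[1] * Tx
    for k in range(2, len(theta)):
        # Chebyshev recurrence: T_{k+1} = 2 L~ T_k - T_{k-1}
        Tx_prev, Tx = Tx, 2.0 * (L_tilde @ Tx) - Tx_prev
        out = out + theta[k] * Tx
    return out

L = np.array([[ 1.0, -0.5],
              [-0.5,  1.0]])          # a toy symmetric Laplacian
x = np.array([1.0, 2.0])

# K = 0 with theta_0 = 1 is the identity filter
assert np.allclose(cheb_filter(L, x, np.array([1.0])), x)
# K = 1: theta_0 x + theta_1 (L - I) x, since lam_max = 2 gives L~ = L - I
expected = 0.3 * x + 0.7 * ((L - np.eye(2)) @ x)
assert np.allclose(cheb_filter(L, x, np.array([0.3, 0.7])), expected)
```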

 

  • Going further: a first-order approximation
    • Let K=1; the filter is then linear in \tilde{L}
    • A rich class of convolutional filter functions can be recovered by stacking multiple such layers
    • Approximate \lambda_{max} \approx 2
    • Let \theta = \theta'_0=-\theta'_1
    • g_{\theta'}*x=\theta'_0x+\theta'_1(L-I_N)x=\theta(I_N+D^{-\frac{1}{2}}AD^{-\frac{1}{2}})x
    • I_N+D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \ \text{has eigenvalues in} \ [0,2]
    • Renormalization trick: I_N+D^{-\frac{1}{2}}AD^{-\frac{1}{2}} \rightarrow \widetilde{D}^{-\frac{1}{2}}\widetilde{A}\widetilde{D}^{-\frac{1}{2}}, where \widetilde{A}=A+I_N and \widetilde{D}_{ii}=\sum_j\widetilde{A}_{i,j}

We can generalize this definition to a signal x\in \mathbb{R}^{N\times C} with C input channels and F filters or feature maps as follows:

H^{(l+1)}=\sigma(\widetilde{D}^{-\frac{1}{2}} \widetilde{A}\widetilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)})
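A minimal sketch of this propagation rule, assuming NumPy, random (untrained) weights, and \sigma = ReLU; `gcn_layer` is a hypothetical helper name:

```python
import numpy as np

# One GCN layer with the renormalization trick:
# H' = ReLU(D~^{-1/2} A~ D~^{-1/2} H W), with A~ = A + I_N.
def gcn_layer(A, H, W):
    A_tilde = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))  # D~^{-1/2} as a vector
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ H @ W, 0.0)            # sigma = ReLU

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # a toy 3-node path graph
H = rng.normal(size=(3, 4))              # N=3 nodes, C=4 input channels
W = rng.normal(size=(4, 2))              # F=2 filters / feature maps
H_next = gcn_layer(A, H, W)
assert H_next.shape == (3, 2)            # one F-dimensional row per node
```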

Applications

  • Graph Semi-supervised Classification
    • A few nodes are labeled while others are not. Assign labels to unlabeled nodes.
    • [Figure 4]
    • Evaluate the cross-entropy error over all labeled examples:
      • \mathcal{L}=-\sum_{l\in \mathcal{Y}_L}\sum_{f=1}^FY_{lf} \ln Z_{lf}
    • Train W using gradient descent
    • [Figure 5]
  • Link Prediction
  • Community detection
  • Traffic Prediction
  • Molecular properties
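The masked cross-entropy loss above can be sketched as follows (a hypothetical untrained one-layer model; graph, features, and labels are arbitrary, and the gradient-descent update on W is omitted):

```python
import numpy as np

# Semi-supervised loss: cross-entropy over labeled nodes Y_L only,
# L = -sum_{l in Y_L} sum_f Y_lf ln Z_lf.
def softmax(X):
    e = np.exp(X - X.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
A_tilde = A + np.eye(4)                          # renormalization trick
d = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = d[:, None] * A_tilde * d[None, :]

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 3))            # node features
W = rng.normal(size=(3, 2))            # F = 2 classes
Z = softmax(A_hat @ X @ W)             # predictions for all nodes

Y = np.array([[1, 0],                  # one-hot labels for labeled nodes
              [0, 1]])
labeled = [0, 1]                       # Y_L: only nodes 0 and 1 are labeled

loss = -np.sum(Y * np.log(Z[labeled]))
assert loss > 0.0
```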

Problems and progress

  • Memory requirement
    • [Figure 6]
  • -> Stochastic Training
    • [Figure 7]
    • Idea:
      • Randomly sample part of the historical activations; the sampled activations, together with the stored history, generate the next layer's representations
    • [Figure 8]

CONTACT INFORMATION

E-Mail: [email protected]

QQ: 46611253

 
