Original article: https://transport.ckcest.cn/Search/get/298151?db=cats_huiyi_jtxs
Traffic forecasting is a typical time-series prediction problem, i.e. predicting the most likely traffic measurements (e.g. speed or traffic flow) in the next $H$ time steps given the previous $M$ traffic observations:
$$\hat{v}_{t+1}, \ldots, \hat{v}_{t+H} = \mathop{\arg\max}\limits_{v_{t+1}, \ldots, v_{t+H}} \log P(v_{t+1}, \ldots, v_{t+H} \mid v_{t-M+1}, \ldots, v_t), \tag{1}$$
where $v_t \in \mathbb{R}^n$ is an observation vector of $n$ road segments at time step $t$, each element of which records the historical observation for a single road segment.
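The formulation above can be made concrete by slicing a multivariate series into (input, target) windows. The sketch below assumes hypothetical sizes ($n = 4$ road segments, $T = 12$ time steps, $M = 3$, $H = 2$) purely for illustration:

```python
import numpy as np

# Hypothetical setup: n = 4 road segments observed over T = 12 time steps.
n, T = 4, 12
series = np.arange(T * n, dtype=float).reshape(T, n)  # row t is the vector v_{t+1}

M, H = 3, 2  # use M past observations to predict the next H

# Slide a window of length M + H over the series to build training pairs.
inputs, targets = [], []
for t in range(M, T - H + 1):
    inputs.append(series[t - M:t])    # v_{t-M+1}, ..., v_t    -> shape (M, n)
    targets.append(series[t:t + H])   # v_{t+1}, ..., v_{t+H}  -> shape (H, n)

X = np.stack(inputs)   # (num_samples, M, n)
Y = np.stack(targets)  # (num_samples, H, n)
```

A model for Eq. (1) is then trained to map each `X[i]` to the corresponding `Y[i]`; the maximization over $v_{t+1}, \ldots, v_{t+H}$ is what the learned predictor approximates.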
In this work, we define the traffic network on a graph and focus on structured traffic time series. The observations $v_t$ are not independent but linked by pairwise connections in the graph. Therefore, the data point $v_t$ can be regarded as a graph signal defined on an undirected graph (or a directed one) with weights $w_{ij}$, as shown in Figure 1. At the $t$-th time step, in graph $G_t = (V_t, E, W)$, $V_t$ is a finite set of vertices, corresponding to the observations from $n$ monitor stations in a traffic network; $E$ is a set of edges, indicating the connectedness between stations; and $W \in \mathbb{R}^{n \times n}$ denotes the weighted adjacency matrix of $G_t$.
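The text does not fix how the weights $w_{ij}$ are chosen, but a common choice in traffic networks (an assumption here, not prescribed above) is a thresholded Gaussian kernel over pairwise station distances:

```python
import numpy as np

# Hypothetical pairwise distances d_ij between n = 3 monitor stations.
dist = np.array([[0.0, 1.0, 3.0],
                 [1.0, 0.0, 2.0],
                 [3.0, 2.0, 0.0]])

# Assumed weighting scheme: w_ij = exp(-d_ij^2 / sigma^2), with small
# weights zeroed by a sparsity threshold eps and no self-loops.
sigma2, eps = 10.0, 0.1
W = np.exp(-dist ** 2 / sigma2)
W[W < eps] = 0.0
np.fill_diagonal(W, 0.0)
```

For an undirected graph `W` is symmetric, and nearby stations receive larger weights, which is exactly the pairwise-connection structure that makes $v_t$ a graph signal rather than an unordered vector.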
A standard convolution for regular grids is clearly not applicable to general graphs. There are currently two basic approaches to generalizing CNNs to structured data. One is to expand the spatial definition of a convolution [Niepert et al., 2016], and the other is to operate in the spectral domain with graph Fourier transforms [Bruna et al., 2013]. The former approach rearranges the vertices into certain grid forms which can be processed by ordinary convolutional operations. The latter introduces a spectral framework to apply convolutions in the spectral domain, often referred to as spectral graph convolution. Several follow-up studies make graph convolution more practical by reducing the computational complexity from $O(n^2)$ to linear [Defferrard et al., 2016; Kipf and Welling, 2016].
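The complexity reduction cited above rests on the Chebyshev trick of [Defferrard et al., 2016]: the spectral filter is expanded as $\sum_k \theta_k T_k(\tilde{L})$, so applying it only needs repeated (sparse) matrix-vector products with the rescaled Laplacian instead of a dense $O(n^2)$ multiply. A sketch of that recursion, on a hypothetical 3-node path graph (not the paper's exact implementation):

```python
import numpy as np

def cheb_filter(L, x, theta, lmax=2.0):
    """Apply the K-term Chebyshev filter sum_k theta_k T_k(L~) x, where
    L~ = 2L/lmax - I_n rescales the eigenvalues of L into [-1, 1] and
    T_k(L~)x = 2 L~ T_{k-1}(L~)x - T_{k-2}(L~)x is the Chebyshev recursion."""
    n = L.shape[0]
    L_tilde = 2.0 * L / lmax - np.eye(n)
    Tx_prev, Tx_curr = x, L_tilde @ x            # T_0(L~)x and T_1(L~)x
    out = theta[0] * Tx_prev + theta[1] * Tx_curr
    for k in range(2, len(theta)):
        Tx_prev, Tx_curr = Tx_curr, 2.0 * L_tilde @ Tx_curr - Tx_prev
        out = out + theta[k] * Tx_curr
    return out

# Toy normalized Laplacian of a 3-node path graph (hypothetical example).
W = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
L = np.eye(3) - d_inv_sqrt @ W @ d_inv_sqrt

y = cheb_filter(L, np.array([1.0, 2.0, 3.0]), theta=[0.5, 0.3, 0.2])
```

Each additional Chebyshev order costs one more matrix-vector product, which is linear in the number of edges for a sparse $\tilde{L}$.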
We introduce the notion of the graph convolution operator "$*_G$" based on the conception of spectral graph convolution, as the multiplication of a signal $x \in \mathbb{R}^n$ with a kernel $\Theta$,
$$\Theta *_G x = \Theta(L)x = \Theta(U \Lambda U^{\mathsf{T}})x = U \Theta(\Lambda) U^{\mathsf{T}} x, \tag{2}$$
where the graph Fourier basis $U \in \mathbb{R}^{n \times n}$ is the matrix of eigenvectors of the normalized graph Laplacian $L = I_n - D^{-1/2} W D^{-1/2} = U \Lambda U^{\mathsf{T}} \in \mathbb{R}^{n \times n}$ ($I_n$ is an identity matrix, $D \in \mathbb{R}^{n \times n}$ is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$); $\Lambda \in \mathbb{R}^{n \times n}$ is the diagonal matrix of eigenvalues of $L$, and the filter $\Theta(\Lambda)$ is also a diagonal matrix. By this definition, a graph signal $x$ is filtered by a kernel $\Theta$ through multiplication between $\Theta$ and the graph Fourier transform $U^{\mathsf{T}} x$ [Shuman et al., 2013].
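Eq. (2) can be checked numerically end to end. The sketch below assumes a toy 3-node path graph for $W$ and a hypothetical low-pass filter $\Theta(\Lambda) = \mathrm{diag}(e^{-\lambda_i})$ (any diagonal function of the eigenvalues works):

```python
import numpy as np

# Toy undirected graph: a 3-node path (hypothetical example).
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
n = W.shape[0]

deg = W.sum(axis=1)                              # D_ii = sum_j W_ij
D_inv_sqrt = np.diag(deg ** -0.5)
L = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt      # L = I_n - D^{-1/2} W D^{-1/2}

lam, U = np.linalg.eigh(L)                       # L = U Lambda U^T (L symmetric)
x = np.array([1.0, 2.0, 3.0])                    # graph signal

x_hat = U.T @ x                                  # graph Fourier transform U^T x
filtered = U @ (np.exp(-lam) * x_hat)            # U Theta(Lambda) U^T x, Eq. (2)
```

Because $\Theta(\Lambda)$ is diagonal, the filtering step is just an elementwise scaling of the spectral coefficients `x_hat` before transforming back with $U$.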
[Niepert et al., 2016] Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. Learning convolutional neural networks for graphs. In ICML, pages 2014–2023, 2016.
[Bruna et al., 2013] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203, 2013.
[Kipf and Welling, 2016] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
[Shuman et al., 2013] David I Shuman, Sunil K Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83–98, 2013.
[Defferrard et al., 2016] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In NIPS, pages 3844–3852, 2016.