Original paper: https://ojs.aaai.org/index.php/AAAI/article/view/3881
Original code: https://github.com/wanhuaiyu/ASTGCN
Abstract: Forecasting traffic flows is a critical issue for researchers and practitioners in the field of transportation. However, it is very challenging, since traffic flows usually show high nonlinearities and complex patterns. Most existing traffic flow prediction methods lack the ability to model the dynamic spatial-temporal correlations of traffic data and thus cannot yield satisfactory prediction results. In this paper, we propose a novel attention based spatial-temporal graph convolutional network (ASTGCN) model to solve the traffic flow forecasting problem. ASTGCN mainly consists of three independent components that respectively model three temporal properties of traffic flows, i.e., recent, daily-periodic and weekly-periodic dependencies. More specifically, each component contains two major parts: 1) the spatial-temporal attention mechanism, which effectively captures the dynamic spatial-temporal correlations in traffic data; 2) the spatial-temporal convolution, which simultaneously employs graph convolutions to capture the spatial patterns and standard convolutions to describe the temporal features. The outputs of the three components are weighted and fused to generate the final prediction results. Experiments on two real-world datasets from the Caltrans Performance Measurement System (PeMS) demonstrate that the proposed ASTGCN model outperforms the state-of-the-art baselines.
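The three inputs (recent, daily-periodic, weekly-periodic) and the weighted fusion described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the 5-minute sampling rate (`STEPS_PER_DAY = 288`), the segment lengths, the helper names `periodic_segment` and `weighted_fusion`, and the use of plain scalar fusion weights are all hypothetical choices made here, not details given in this section.

```python
def periodic_segment(series, t0, horizon, period, num_periods):
    """For each of the last `num_periods` periods, collect the `horizon`
    values that lie exactly k * period steps before the forecast window."""
    segment = []
    for k in range(num_periods, 0, -1):
        start = t0 - k * period
        segment.extend(series[start:start + horizon])
    return segment


def weighted_fusion(outputs, weights):
    """Element-wise weighted sum of equally shaped component outputs."""
    return [sum(w * y[i] for w, y in zip(weights, outputs))
            for i in range(len(outputs[0]))]


# Toy series: the value at each step is simply its index.
STEPS_PER_DAY = 288          # assumption: one reading every 5 minutes
series = list(range(10 * STEPS_PER_DAY))
t0 = 8 * STEPS_PER_DAY       # first step of the forecast window
horizon = 3                  # number of steps to forecast

recent = series[t0 - 6:t0]   # the six steps right before the window
daily = periodic_segment(series, t0, horizon, STEPS_PER_DAY, 2)
weekly = periodic_segment(series, t0, horizon, 7 * STEPS_PER_DAY, 1)

# Stand-ins for the outputs of the three components (made-up numbers).
fused = weighted_fusion([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                        [0.5, 0.3, 0.2])
```

In a trained model the fusion weights would be learned rather than fixed by hand; the fixed values above only show the arithmetic.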
In recent years, many countries have committed to vigorously developing Intelligent Transportation Systems (ITS) (Zhang et al. 2011) to support efficient traffic management. Traffic forecasting is an indispensable part of ITS, especially on highways, which carry large traffic flows at high driving speeds. Since a highway is relatively closed, congestion, once it occurs, seriously affects traffic capacity. Traffic flow is a fundamental measurement reflecting the state of the highway. If it can be predicted accurately in advance, traffic management authorities will be able to guide vehicles more reasonably and improve the operating efficiency of the highway network.
Highway traffic flow forecasting is a typical problem of spatial-temporal data forecasting. Traffic data are recorded at fixed points in time and at fixed locations distributed in continuous space. The observations made at neighboring locations and time stamps are evidently not independent but dynamically correlated with each other. Therefore, the key to solving such problems is to effectively extract the spatial-temporal correlations of the data. Fig. 1 demonstrates the spatial-temporal correlations of traffic flows (the same applies to vehicle speed, lane occupancy, etc.). The bold line between two points represents the strength of their mutual influence. The darker the line, the greater the influence. In the spatial dimension (Fig. 1(a)), we can see that different locations have different impacts on A, and even the same location has a varying influence on A as time goes by. In the temporal dimension (Fig. 1(b)), the historical observations of different locations have varying impacts on A's traffic states at different times in the future. In conclusion, the correlations in traffic data on the highway network show strong dynamics in both the spatial and the temporal dimension. How to explore nonlinear and complex spatial-temporal data, discover its inherent spatial-temporal patterns, and make accurate traffic flow predictions is a very challenging issue.
Fortunately, with the development of the transportation industry, many cameras, sensors and other information collection devices have been deployed on highways. Each device is placed at a unique geospatial location and constantly generates time series data about traffic. These devices have accumulated a large amount of rich traffic time series data with geographic information, providing a solid data foundation for traffic forecasting. Many researchers have already made great efforts to solve such problems. Early on, time series analysis models were employed for traffic prediction problems, yet it is difficult for them to handle unstable and nonlinear data in practice. Later, traditional machine learning methods were developed to model more complex data, but it is still difficult for them to simultaneously consider the spatial-temporal correlations of high-dimensional traffic data. Moreover, the prediction performance of this kind of method relies heavily on feature engineering, which often requires considerable experience from experts in the corresponding domain. In recent years, many researchers have used deep learning methods to deal with high-dimensional spatial-temporal data, e.g., convolutional neural networks (CNN) are adopted to effectively extract the spatial features of grid-based data, and graph convolutional neural networks (GCN) are used to describe the spatial correlations of graph-based data. However, these methods still fail to simultaneously model the spatial-temporal features and dynamic correlations of traffic data.
To tackle the above challenges, we propose a novel deep learning model, the Attention based Spatial-Temporal Graph Convolutional Network (ASTGCN), to collectively predict traffic flow at every location on the traffic network. This model can process the traffic data directly on the original graph-based traffic network and effectively capture the dynamic spatial-temporal features. The main contributions of this paper are summarized as follows:
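We develop a spatial-temporal attention mechanism to learn the dynamic spatial-temporal correlations of traffic data. Specifically, spatial attention is applied to model the complex spatial correlations between different locations, and temporal attention is applied to capture the dynamic temporal correlations between different times.
A novel spatial-temporal convolution module is designed to model the spatial-temporal dependencies of traffic data. It consists of graph convolutions, which capture spatial features from the original graph-based traffic network structure, and convolutions in the temporal dimension, which describe the dependencies between nearby time slices.
Extensive experiments are carried out on real-world highway traffic datasets, verifying that our model achieves the best prediction performance compared with the existing baselines.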
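The idea behind the spatial-temporal convolution module in the second contribution, a spatial step over the graph followed by a convolution along the time axis, can be illustrated with a deliberately simplified sketch. It is not the paper's module: the spatial step below is plain neighbor averaging rather than a spectral graph convolution, no attention is applied, and the names `graph_step` and `temporal_step`, the toy graph and the kernel values are all hypothetical.

```python
def graph_step(adj, x):
    """Average each node's signal with its neighbours' at every time step.
    adj: N x N 0/1 adjacency with self-loops; x: N x T signal."""
    n, t = len(x), len(x[0])
    out = []
    for i in range(n):
        neigh = [j for j in range(n) if adj[i][j]]
        out.append([sum(x[j][s] for j in neigh) / len(neigh)
                    for s in range(t)])
    return out


def temporal_step(x, kernel):
    """Valid 1-D convolution (cross-correlation form) along the time axis
    of every node, mixing each time slice with the slices next to it."""
    k = len(kernel)
    return [[sum(kernel[m] * row[s + m] for m in range(k))
             for s in range(len(row) - k + 1)]
            for row in x]


# Three sensors on a line (0 - 1 - 2), four time steps each.
adj = [[1, 1, 0],
       [1, 1, 1],
       [0, 1, 1]]
x = [[1.0, 2.0, 3.0, 4.0],
     [3.0, 4.0, 5.0, 6.0],
     [5.0, 6.0, 7.0, 8.0]]

smoothed = graph_step(adj, x)                   # spatial mixing
features = temporal_step(smoothed, [0.5, 0.5])  # temporal mixing
```

The output has one fewer time step than the input because the kernel of length 2 is applied without padding.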
Traffic forecasting. After years of continuous research and practice, many achievements have been made in the study of traffic forecasting. The statistical models used for traffic prediction include HA, ARIMA (Williams and Hoel 2003), VAR (Zivot and Wang 2006), etc. These approaches require the data to satisfy certain assumptions, but traffic data is too complex to satisfy them, so they usually perform poorly in practice. Machine learning methods such as KNN (Van Lint and Van Hinsbergen 2012) and SVM (Jeong et al. 2013) can model more complex data, but they need careful feature engineering. Since deep learning has brought about breakthroughs in many domains, such as speech recognition and image processing, more and more researchers apply deep learning to spatial-temporal data prediction. Zhang et al. (2018) designed an ST-ResNet model based on the residual convolution unit to predict crowd flows. Yao et al. (2018b) proposed a method to predict traffic by integrating CNN and long short-term memory (LSTM) to jointly model both spatial and temporal dependencies. Yao et al. (2018a) further proposed a Spatial-Temporal Dynamic Network for taxi demand prediction, which can learn the similarity between locations dynamically. Although these models can extract the spatial-temporal features of traffic data, their limitation is that the input must be standard 2D or 3D grid data.
Convolutions on graphs. The traditional convolution can effectively extract the local patterns of data, but it can only be applied to standard grid data. Recently, the graph convolution has generalized the traditional convolution to graph-structured data. The two mainstream families of graph convolution methods are the spatial methods and the spectral methods. The spatial methods directly apply convolution filters to a graph's nodes and their neighbors, so the core of this kind of method is selecting the neighborhood of each node. Niepert, Ahmed, and Kutzkov (2016) proposed a heuristic linear method to select the neighborhood of every center node, which achieved good results in social network tasks. Li et al. (2018) introduced graph convolutions into human action recognition tasks, proposing several partitioning strategies that divide the neighborhood of each node into different subsets while ensuring that every node has the same number of subsets. In the spectral methods, the locality of the graph convolution is considered through spectral analysis. A general graph convolution framework based on the graph Laplacian was proposed by Bruna et al. (2014), and Defferrard, Bresson, and Vandergheynst (2016) then optimized the method by using a Chebyshev polynomial approximation, which avoids an explicit eigenvalue decomposition. Yu, Yin, and Zhu (2018) proposed a gated graph convolution network for traffic prediction based on this method, but the model does not consider the dynamic spatial-temporal correlations of traffic data.
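The Chebyshev approximation mentioned above can be sketched in a few lines. The filter output is a weighted sum of Chebyshev polynomials of the scaled Laplacian applied to the signal, computed with the recurrence T_0 x = x, T_1 x = L~ x, T_k x = 2 L~ T_{k-1} x - T_{k-2} x, so no eigenvectors are ever computed. The sketch below makes several assumptions of its own: it uses the symmetric normalized Laplacian, sets the largest eigenvalue to the common approximation `lambda_max = 2.0`, requires every node to have at least one neighbor, and uses hypothetical function names.

```python
import math


def scaled_laplacian(adj, lambda_max=2.0):
    """L~ = 2 L / lambda_max - I, with L = I - D^(-1/2) A D^(-1/2).
    Assumes every node has at least one neighbour (degree > 0)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[(1.0 if i == j else 0.0)
            - adj[i][j] / math.sqrt(deg[i] * deg[j])
            for j in range(n)] for i in range(n)]
    return [[2.0 * lap[i][j] / lambda_max - (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def cheb_conv(adj, x, theta):
    """Graph filter sum_k theta[k] * T_k(L~) x via the Chebyshev recurrence."""
    l_tilde = scaled_laplacian(adj)
    t_prev, t_curr = x, matvec(l_tilde, x)
    out = [theta[0] * v for v in t_prev]
    if len(theta) > 1:
        out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        t_next = [2.0 * a - b
                  for a, b in zip(matvec(l_tilde, t_curr), t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Path graph 0 - 1 - 2 with a unit impulse on node 0.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
impulse = [1.0, 0.0, 0.0]

order1 = cheb_conv(adj, impulse, [0.0, 1.0])        # reaches node 1 only
order2 = cheb_conv(adj, impulse, [0.0, 0.0, 1.0])   # reaches node 2 as well
```

The example also shows the locality property: a filter with terms up to order k only mixes information from nodes at most k hops away, which is why the order-1 response at node 2 is zero.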
Attention mechanism. Recently, attention mechanisms have been widely used in various tasks such as natural language processing, image captioning and speech recognition. The goal of the attention mechanism is to select, from all of the input, the information that is most critical to the current task. Xu et al. (2015) proposed two attention mechanisms for the image description task and adopted a visualization method to show the effect of the attention mechanism intuitively. For classifying the nodes of a graph, Velickovic et al. (2018) leveraged self-attentional layers to process graph-structured data with neural networks and achieved state-of-the-art results. To forecast time series, Liang et al. (2018) proposed a multi-level attention network to adaptively adjust the correlations among multiple geographic sensor time series. However, it is time-consuming in practice, since a separate model needs to be trained for each time series.
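As a generic illustration of how an attention mechanism selects the most relevant inputs, the sketch below scores each input against a query, normalizes the scores with a softmax, and returns the weighted sum of the associated values. The dot-product scoring and the names `softmax` and `attend` are assumptions chosen for simplicity; none of the cited works is claimed to use exactly this form.

```python
import math


def softmax(scores):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Score every key against the query by dot product, then return the
    attention weights and the weighted sum of the values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(dim)]
    return weights, context


# The first key is aligned with the query, the last one opposes it.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]

weights, context = attend(query, keys, values)
```

The input whose key matches the query best receives the largest weight, so its value dominates the result.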
Motivated by the studies mentioned above, and considering the graph structure of the traffic network and the dynamic spatial-temporal patterns of the traffic data, we simultaneously employ graph convolutions and attention mechanisms to model the network-structured traffic data.
Zhang, J.; Wang, F.Y.; Wang, K.; Lin, W.H.; Xu, X.; and Chen, C. 2011. Data-driven intelligent transportation systems: A survey. IEEE Transactions on Intelligent Transportation Systems 12(4):1624–1639.
Williams, B. M., and Hoel, L. A. 2003. Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results. Journal of transportation engineering 129(6):664–672.
Zivot, E., and Wang, J. 2006. Vector autoregressive models for multivariate time series. Modeling Financial Time Series with S-PLUS® 385–429.
Van Lint, J., and Van Hinsbergen, C. 2012. Short-term traffic and travel time prediction models. Artificial Intelligence Applications to Critical Transportation Issues 22(1):22–41.
Jeong, Y.S.; Byon, Y.J.; Castro-Neto, M. M.; and Easa, S. M. 2013. Supervised weighting-online learning algorithm for short-term traffic flow prediction. IEEE Transactions on Intelligent Transportation Systems 14(4):1700–1707.
Zhang, J.; Zheng, Y.; Qi, D.; Li, R.; Yi, X.; and Li, T. 2018. Predicting citywide crowd flows using deep spatio-temporal residual networks. Artificial Intelligence 259:147–166.
Yao, H.; Wu, F.; Ke, J.; Tang, X.; Jia, Y.; Lu, S.; Gong, P.; and Ye, J. 2018b. Deep multi-view spatial-temporal network for taxi demand prediction. In AAAI Conference on Artificial Intelligence, 2588–2595.
Yao, H.; Tang, X.; Wei, H.; Zheng, G.; Yu, Y.; and Li, Z. 2018a. Modeling spatial-temporal dynamics for traffic prediction. arXiv preprint arXiv:1803.01254.
Niepert, M.; Ahmed, M.; and Kutzkov, K. 2016. Learning convolutional neural networks for graphs. In International conference on machine learning, 2014–2023.
Li, C.; Cui, Z.; Zheng, W.; Xu, C.; and Yang, J. 2018. Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition. In AAAI Conference on Artificial Intelligence, 3482–3489.
Bruna, J.; Zaremba, W.; Szlam, A.; and Lecun, Y. 2014. Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations.
Defferrard, M.; Bresson, X.; and Vandergheynst, P. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, 3844–3852.
Yu, B.; Yin, H.; and Zhu, Z. 2018. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. In International Joint Conference on Artificial Intelligence, 3634–3640.
Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhudinov, R.; Zemel, R.; and Bengio, Y. 2015. Show, attend and tell: Neural image caption generation with visual attention. In International conference on machine learning, 2048–2057.
Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; and Bengio, Y. 2018. Graph attention networks. In International Conference on Learning Representations.
Liang, Y.; Ke, S.; Zhang, J.; Yi, X.; and Zheng, Y. 2018. GeoMAN: Multi-level Attention Networks for Geo-sensory Time Series Prediction. In International Joint Conference on Artificial Intelligence, 3428–3434.