Big Data Analytics for Network Level Short-Term Travel Time Prediction with Hierarchical LSTM and Attention

Tianya T. Zhang, Ying Ye

The travel time data collected from widespread traffic monitoring sensors necessitate big data analytic tools for querying, visualization, and identifying meaningful traffic patterns. This paper utilizes a large-scale travel time dataset from Caltrans Performance Measurement System (PeMS) system that is an overflow for traditional data processing and modeling tools. To overcome the challenges of the massive amount of data, the big data analytic engines Apache Spark and Apache MXNet are applied for data wrangling and modeling. Seasonality and autocorrelation were performed to explore and visualize the trend of time-varying data. Inspired by the success of the hierarchical architecture for many Artificial Intelligent (AI) tasks, we consolidate the cell and hidden states passed from low-level to the high-level LSTM with an attention pooling similar to how the human perception system operates. The designed hierarchical LSTM model can consider the dependencies at different time scales to capture the spatial-temporal correlations of network-level travel time. Another self-attention module is then devised to connect LSTM extracted features to the fully connected layers, predicting travel time for all corridors instead of a single link/route. The comparison results show that the Hierarchical LSTM with Attention (HierLSTMat) model gives the best prediction results at 30-minute and 45-min horizons and can successfully forecast unusual congestion. The efficiency gained from big data analytic tools was evaluated by comparing them with popular data science and deep learning frameworks.

Knowledge Graph



Sign up or login to leave a comment