Accurate prediction of traffic information (i.e., traffic flow, travel time, traffic speed, etc.) is a key component of Intelligent Transportation System (ITS). Traffic speed is an important indicator to evaluate traffic efficiency. Up to date, although a few studies have considered the periodic feature in traffic prediction, very few studies comprehensively evaluate the impact of periodic component on statistical and machine learning prediction models. This paper selects several representative statistical models and machine learning models to analyze the influence of periodic component on short-term speed prediction under different scenarios: (1) multi-horizon ahead prediction (5, 15, 30, 60 minutes ahead predictions), (2) with and without periodic component, (3) two data aggregation levels (5-minute and 15-minute), (4) peak hours and off-peak hours. Specifically, three statistical models (i.e., space time (ST) model, vector autoregressive (VAR) model, autoregressive integrated moving average (ARIMA) model) and three machine learning approaches (i.e., support vector machines (SVM) model, multi-layer perceptron (MLP) model, recurrent neural network (RNN) model) are developed and examined. Furthermore, the periodic features of the speed data are considered via a hybrid prediction method, which assumes that the data consist of two components: a periodic component and a residual component. The periodic component is described by a trigonometric regression function, and the residual component is modeled by the statistical models or the machine learning approaches. The important conclusions can be summarized as follows: (1) the multi-step ahead prediction accuracy improves when considering the periodic component of speed data for both three statistical models and three machine learning models, especially in the peak hours; (2) considering the impact of periodic component for all models, the prediction performance improvement gradually becomes larger as the time step increases; (3) under the same prediction horizon, the prediction performance of all models for 15-minute speed data is generally better than that for 5-minute speed data. Overall, the findings in this paper suggest that the proposed hybrid prediction approach is effective for both statistical and machine learning models in short-term speed prediction.
This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highway is a topic of interest in the area of transportation safety due to higher driving speeds and the resultant severity level. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate an unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments are conducted to demonstrate that the SNP distribution can well mimic a large family of distributions, including normal distributions, log-gamma distributions, bimodal and trimodal distributions. Empirical estimation results show that such flexibility offered by the SNP distribution can greatly improve model precision and the overall goodness-of-fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When estimated coefficients in empirical models are compared, SNP and NB models are found to have a substantially different coefficient for the dummy variable indicating the lane width. The SNP model with better statistical performance suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.