A key problem in location-based modeling and forecasting lies in identifying suitable spatial and temporal resolutions. In particular, judicious spatial partitioning can play a significant role in enhancing the performance of location-based forecasting models. In this work, we investigate two widely used tessellation strategies for partitioning city space, in the context of real-time taxi demand forecasting. Our study compares (i) Geohash tessellation and (ii) Voronoi tessellation, using two distinct taxi demand datasets, over multiple time scales. For the purpose of comparison, we employ classical time-series tools to model the spatio-temporal demand. Our study finds that the performance of each tessellation strategy is highly dependent on the city geography, the spatial distribution of the data, and the time of day, and that neither strategy performs optimally across the forecast horizon. We propose a hybrid tessellation algorithm that picks the best tessellation strategy at each instant, based on their performance in the recent past. Our hybrid algorithm is a non-stationary variant of the well-known HEDGE algorithm for choosing the best advice from multiple experts. We show that the hybrid tessellation strategy performs consistently better than either of the two strategies across the datasets considered, at multiple time scales, and with different performance metrics. We achieve an average accuracy of above 80% per km² for both datasets considered at 60-minute aggregation levels.
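The idea of choosing between two tessellations from their recent performance can be illustrated with a minimal sketch of a discounted Hedge-style selector. This is not the authors' algorithm: the function name `discounted_hedge`, the learning rate `eta`, the discount factor `gamma`, the loss values, and the deterministic choice of the highest-weight expert (standard Hedge samples an expert in proportion to its weight) are all assumptions made for illustration.

```python
import math


def discounted_hedge(losses, eta=0.5, gamma=0.9):
    """Pick one expert per step from discounted past losses.

    losses: sequence of per-step tuples, one loss in [0, 1] per expert
            (hypothetical values; e.g. expert 0 = Geohash, expert 1 = Voronoi).
    eta:    learning rate of the exponential weights (assumed value).
    gamma:  discount in (0, 1]; values below 1 forget old losses, which is
            one simple way to handle non-stationarity (assumed mechanism).
    Returns the list of expert indices chosen at each step.
    """
    n_experts = len(losses[0])
    discounted = [0.0] * n_experts  # discounted cumulative loss per expert
    choices = []
    for step_losses in losses:
        # Exponential weights computed only from losses seen so far.
        weights = [math.exp(-eta * c) for c in discounted]
        # Deterministic variant: follow the currently heaviest expert.
        choices.append(max(range(n_experts), key=lambda i: weights[i]))
        # Reveal this step's losses and fold them into the discounted sums.
        discounted = [gamma * c + l for c, l in zip(discounted, step_losses)]
    return choices


if __name__ == "__main__":
    # Expert 0 is better for two steps, then expert 1 becomes better.
    demo = [(0.1, 0.9), (0.1, 0.9), (0.9, 0.1), (0.9, 0.1), (0.9, 0.1)]
    print(discounted_hedge(demo, gamma=0.9))
    print(discounted_hedge(demo, gamma=0.5))
```

In this toy run, a smaller `gamma` makes the selector switch to the second expert sooner after the losses change, which is the behavior a non-stationary variant is meant to provide.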
In this paper, we explore various statistical techniques for anomaly detection in conjunction with the popular Long Short-Term Memory (LSTM) deep learning model for transportation networks. We obtain the prediction errors from an LSTM model, and then apply three statistical models based on (i) the Gaussian distribution, (ii) Extreme Value Theory (EVT), and (iii) Tukey's method. Using statistical tests and numerical studies, we find strong evidence against the widely employed Gaussian-distribution-based detection rule on the prediction errors. Next, motivated by fundamental results from Extreme Value Theory, we propose a detection technique that does not assume any parent distribution on the prediction errors. Through numerical experiments conducted on several real-world traffic datasets, we show that the EVT-based detection rule is superior to the other detection rules, and is supported by statistical evidence.
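Two of the three detection rules named above, the Gaussian rule and Tukey's method, are simple enough to sketch on a list of prediction errors. The code below is an illustrative sketch only: the function names, the thresholds `k=3.0` and `k=1.5` (the conventional defaults), and the sample errors are assumptions, not values from the paper, and the EVT-based rule is not shown.

```python
import statistics


def gaussian_flags(errors, k=3.0):
    """Flag errors more than k sample standard deviations from the mean."""
    mu = statistics.fmean(errors)
    sigma = statistics.stdev(errors)
    return [abs(e - mu) > k * sigma for e in errors]


def tukey_flags(errors, k=1.5):
    """Flag errors outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(errors, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [e < low or e > high for e in errors]


if __name__ == "__main__":
    # Hypothetical prediction errors with one large spike at the end.
    errors = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, 8.0]
    print("gaussian:", gaussian_flags(errors))
    print("tukey:   ", tukey_flags(errors))
```

On this made-up sample, the single spike inflates the standard deviation enough that the 3-sigma rule misses it, while Tukey's fences, which rely on quartiles, flag it. This only illustrates one known weakness of the Gaussian rule on small samples; it does not reproduce the paper's experiments.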