Given a spatial network and a collection of activities (e.g., pedestrian fatality reports, crime reports), Significant Linear Hotspot Discovery (SLHD) finds all shortest paths in the spatial network where the concentration of activities is statistically significantly high. SLHD is important for societal applications in transportation safety or public safety such as finding paths with significant concentrations of accidents or crimes. SLHD is challenging because 1) there are a potentially large number of candidate paths (∼ 10 16 ) in a given dataset with millions of activities and road network nodes and 2) test statistic (e.g., density ratio) is not monotonic. Hotspot detection approaches on Euclidean space (e.g., SaTScan) may miss significant paths since a large fraction of an area bounded by shapes in Euclidean space for activities on a path will be empty. Previous network-based approaches consider only paths between road intersections but not activities. This paper proposes novel models and algorithms for discovering statistically significant linear hotspots using the algorithms of neighbor node filter, shortest path tree pruning, and Monte Carlo speedup. We present case studies comparing the proposed approaches with existing techniques on real data. Experimental results show that the proposed algorithms yield substantial computational savings without reducing result quality.
Given a set of trajectories on a road network, the goal of the All-Pair Network Trajectory Similarity (APNTS) problem is to calculate the similarity between all trajectories using the Network Hausdorff Distance. This problem is important for a variety of societal applications, such as facilitating greener travel via bicycle corridor identification. The APNTS problem is challenging due to the high cost of computing the exact Network Hausdorff Distance between trajectories in spatial big datasets. Previous work on the APNTS problem takes over 16 hours of computation time on a real-world dataset of bicycle GPS trajectories in Minneapolis, MN. In contrast, this paper focuses on a scalable method for the APNTS problem using the idea of row-wise computation, resulting in a computation time of less than 6 minutes on the same datasets. We provide a case study for transportation services using a data-driven approach to identify primary bicycle corridors for public transportation by leveraging emerging GPS trajectory datasets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.