Grids are commonly used as histograms to process spatial data in order to detect frequent patterns, predict destinations, or infer popular places. However, they have not previously been used for GPS trajectory similarity search or for retrieval in general. Instead, slower and more complicated algorithms based on individual point-pair comparison have been used. We demonstrate how a grid representation can be used to compute four different route measures: novelty, noteworthiness, similarity, and inclusion. The measures may be used in several applications such as identifying taxi fraud, automatically updating GPS navigation software, optimizing traffic, and identifying commuting patterns. We compare our proposed route similarity measure, C-SIM, to eight popular alternatives including Edit Distance on Real sequence (EDR) and Fréchet distance. The proposed measure is simple to implement, and we give a fast, linear-time algorithm for computing it. It works well under noise, changes in sampling rate, and point shifting. We demonstrate that by using the grid, a route similarity ranking can be computed in real time on the Mopsi2014 route dataset, which consists of over 6,000 routes. This ranking is an extension of the most similar route search and contains an ordered list of all similar routes from the database. The real-time search is made possible by indexing the cell database and comes at the cost of 80% more memory for the index. The methods are implemented inside the Mopsi route module.
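The grid idea in the abstract above can be illustrated with a minimal sketch. It is not the C-SIM measure itself, whose exact definition is not given here. It assumes that route points are already projected to planar coordinates in metres, uses a hypothetical 25 m cell size, and substitutes a plain Jaccard ratio over cell sets for the similarity and a coverage ratio for the inclusion. All function names are illustrative. The inverted index from cells to route identifiers shows why a ranking only needs to visit routes that share at least one cell with the query.

```python
from collections import defaultdict

def route_cells(points, cell_size=25.0):
    """Map planar (x, y) points in metres to the set of grid cells they fall in.

    cell_size is an assumed value, not one taken from the paper.
    """
    return {(int(x // cell_size), int(y // cell_size)) for x, y in points}

def cell_similarity(a, b):
    """Jaccard ratio of two cell sets (a stand-in for the paper's measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cell_inclusion(a, b):
    """Fraction of the cells of route a that are also visited by route b."""
    return len(a & b) / len(a) if a else 0.0

def build_index(routes):
    """Inverted index: cell -> identifiers of the routes passing through it."""
    index = defaultdict(set)
    for route_id, cells in routes.items():
        for cell in cells:
            index[cell].add(route_id)
    return index

def rank_similar(query_cells, routes, index):
    """Rank the routes sharing at least one cell with the query, best first."""
    shared = defaultdict(int)
    for cell in query_cells:
        for route_id in index.get(cell, ()):
            shared[route_id] += 1
    scores = {
        route_id: n / (len(query_cells) + len(routes[route_id]) - n)
        for route_id, n in shared.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data: route "b" overlaps the query in two cells, route "c" is far away.
query = route_cells([(0, 0), (30, 0), (60, 0)])
routes = {
    "b": route_cells([(5, 5), (35, 5), (60, 30)]),
    "c": route_cells([(1000, 1000)]),
}
index = build_index(routes)
ranking = rank_similar(query, routes, index)
```

Each point is mapped to a cell in constant time, so building a cell set is linear in the number of points. The sketch does not interpolate between consecutive points, so sparsely sampled routes would leave gaps in their cell sets.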
Highlights:
Token-level measures outperform character-level measures when the order of the words varies.
Q-grams provide a good compromise between token- and character-level measures.
Token-level measures are significantly outperformed by their soft variants.
Soft measures based on set-matching methods perform best when using q-grams at the character level.
The performance of similarity measures varies depending on the type of dataset.
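The contrast between token-level and q-gram representations can be shown with a small sketch. It assumes Jaccard similarity as the set-matching measure and character bigrams (q = 2) as the q-grams. Both are illustrative choices, not necessarily the configurations evaluated in the paper, and the function names are hypothetical.

```python
def tokens(text):
    """Token-level representation: the set of lowercase whitespace-separated words."""
    return set(text.lower().split())

def qgrams(text, q=2):
    """Character-level q-gram representation: the set of all substrings of length q."""
    text = text.lower()
    if len(text) < q:
        return {text}
    return {text[i:i + q] for i in range(len(text) - q + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Word order changes: the token sets are identical, the bigram sets differ slightly.
reordered_token = jaccard(tokens("grid route search"), tokens("route search grid"))
reordered_qgram = jaccard(qgrams("grid route search"), qgrams("route search grid"))

# A typo inside one word: the whole token mismatches, but most bigrams still agree.
typo_token = jaccard(tokens("route similarity"), tokens("route similarty"))
typo_qgram = jaccard(qgrams("route similarity"), qgrams("route similarty"))
```

In this toy case the token sets are unaffected by reordering but penalize a single typo heavily, while the bigram sets degrade only mildly in both cases.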
Road networks are essential nowadays, especially for people travelling to large, unfamiliar cities. Moreover, cities are constantly growing and road networks need periodic updates to provide reliable information. We propose an automatic method to generate the road network using a GPS trajectory dataset. The method, called CellNet, works by first detecting the intersections (junctions) using a clustering-based technique and then creating the road segments in-between. We compare CellNet against conceptually different alternatives using Chicago and Joensuu datasets. The results show that CellNet provides better accuracy and is less sensitive to parameter setup. The size of the generated road network is only 25% of the networks produced by other methods. This implies that the network provided by CellNet has much less redundancy.