Robust distributed indexing for locality-skewed workloads

Lee, Mu-Woong; Hwang, Seung-won

doi:10.1145/2396761.2398438

Cited by 2 publications

(1 citation statement)

References 28 publications


“…H can be set as a multiple of the maximum number of parallel map tasks running in a cluster (the details of setting H are discussed in the Appendix B). In general, any hash function (e.g., [28]) which can partition objects into groups that keep the same distribution of objects as the overall distribution can be adopted here. In our experiments, since the identifiers of trajectory objects in the dataset are uniformly distributed, we simply hash the objects according to their identifiers, i.e., the hash function is a simple modulo function hash(tr)=tr.id%H.…”

Section: Trajectory Join (mentioning)

confidence: 99%
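The modulo partitioning described in the quoted statement can be sketched as follows. This is a minimal illustration, not the citing authors' implementation: the function name `hash_partition`, the integer identifiers, and the choice of H as twice a made-up number of parallel map tasks are all assumptions for the example.

```python
from collections import defaultdict


def hash_partition(object_ids, num_groups):
    """Assign each object id to group `id % num_groups` (hypothetical helper)."""
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")
    groups = defaultdict(list)
    for oid in object_ids:
        groups[oid % num_groups].append(oid)
    return dict(groups)


# Assumed setup: 4 parallel map tasks, H chosen as a multiple (2x) of that.
parallel_map_tasks = 4
H = 2 * parallel_map_tasks

# Uniformly distributed integer identifiers, as the statement assumes.
groups = hash_partition(range(100), H)
sizes = {g: len(members) for g, members in sorted(groups.items())}
print(sizes)
```

With uniformly distributed identifiers the group sizes differ by at most one, which is the property the statement relies on; skewed identifiers would need a different hash function.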

Scalable Algorithms for Nearest-Neighbor Joins on Big Trajectory Data

Fang, Cheng, Tang, et al. 2016

IEEE Trans. Knowl. Data Eng.


Trajectory data are prevalent in systems that monitor the locations of moving objects. In a location-based service, for instance, the positions of vehicles are continuously monitored through GPS; the trajectory of each vehicle describes its movement history. We study joins on two sets of trajectories, generated by two sets M and R of moving objects. For each entity in M, a join returns its k nearest neighbors from R. We examine how this query can be evaluated in cloud environments. This problem is not trivial, due to the complexity of the trajectory, and the fact that both the spatial and temporal dimensions of the data have to be handled. To facilitate this operation, we propose a parallel solution framework based on MapReduce. We also develop a novel bounding technique, which enables trajectories to be pruned in parallel. Our approach can be used to parallelize existing single-machine trajectory join algorithms. We also study a variant of the join, which can further improve query efficiency. To evaluate the efficiency and the scalability of our solutions, we have performed extensive experiments on large real and synthetic datasets.
