2021
DOI: 10.3390/ijgi10110763

Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark

Panagiotis Moutafis,
George Mavrommatis,
Michael Vassilakopoulos
et al.

Abstract: Aiming at the problem of spatial query processing in distributed computing systems, the design and implementation of new distributed spatial query algorithms is a current challenge. Apache Spark is a memory-based framework suitable for real-time and batch processing. Spark-based systems allow users to work on distributed in-memory data, without worrying about the data distribution mechanism and fault-tolerance. Given two datasets of points (called Query and Training), the group K nearest-neighbor (GKNN) query …
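
The abstract is cut off before it defines the query. Assuming the definition commonly used in the GKNN literature (the K Training points with the smallest sum of distances to all Query points), a brute-force, single-machine sketch might look like the following. The function name `gknn`, the Euclidean metric, and the sample points are illustrative assumptions; this is not the authors' distributed Spark algorithm.

```python
import heapq
import math


def gknn(query, training, k):
    """Return the k training points with the smallest summed
    Euclidean distance to every query point (brute force)."""

    def total_distance(point):
        # Sum of distances from one training point to all query points.
        return sum(math.dist(point, q) for q in query)

    # nsmallest avoids fully sorting the training set when k is small.
    return heapq.nsmallest(k, training, key=total_distance)


# Hypothetical sample data, chosen only for illustration.
query = [(0.0, 0.0), (2.0, 0.0)]
training = [(1.0, 0.0), (5.0, 5.0), (1.0, 1.0), (-3.0, 0.0)]
result = gknn(query, training, 2)
print(result)
```

This scan costs one distance computation per (Training, Query) pair, which is the kind of cost a distributed algorithm would aim to partition or prune.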

Cited by 4 publications (3 citation statements)
References 45 publications
“…Voronoi diagram (VD) plays the role as a very effective tool in computing geometries. In recent years, VDs have been widely used in spatial databases to describe spatial neighbor relationships; they also are used for realization of a spatial neighbor query, spatial interpolation, and buffer analysis (Moutafis et al, 2021).…”
Section: Voronoi Diagram (mentioning)
confidence: 99%
“…Because Hadoop MapReduce is a disk-based distributed computing framework, its response time is relatively slow, so it is not suitable for online queries. Therefore, more and more researchers have proposed spatial query processing algorithms based on the Spark framework [ 4 , 5 , 6 , 7 , 24 , 25 , 26 , 27 , 28 ], which is an in-memory computing framework with a faster processing speed. Xie [ 6 ] proposed a spatial big data analysis system based on Spark, which expanded the Spark SQL engine and supported the construction of an RDD-based memory index, thus effectively supporting various query operations such as range query and kNN query.…”
Section: Related Work (mentioning)
confidence: 99%
“…Compared with the original Spark, SparkNN significantly improves the average query time. Moutafis [ 26 ] proposed the first distributed GKNN query algorithm in Apache Spark, and this method proved to be more efficient than Apache Hadoop.…”
Section: Related Work (mentioning)
confidence: 99%