2015
DOI: 10.14778/2824032.2824057

Spatial partitioning techniques in SpatialHadoop

Abstract: SpatialHadoop is an extended MapReduce framework that supports global indexing, which spatially partitions the data across machines and provides orders of magnitude speedup compared to traditional Hadoop. In this paper, we describe seven alternative partitioning techniques and experimentally study their effect on the quality of the generated index and the performance of range and spatial join queries. We found that using a 1% sample is enough to produce high quality partitions. Also, we found that the total area of …
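The sampling claim in the abstract can be illustrated with a small sketch. It is not one of the seven techniques from the paper: it assumes a simple one-dimensional strip partitioning along x, and the names `sample_boundaries` and `assign_partition` are hypothetical. Split positions are taken from equal-count quantiles of a 1% random sample, and every point is then routed to a partition using only those positions.

```python
import random
from bisect import bisect_right


def sample_boundaries(points, num_partitions, sample_ratio=0.01, seed=0):
    """Choose x-axis split positions from a random sample of the points.

    Illustrative strip partitioning only; not SpatialHadoop's implementation.
    """
    if not points:
        raise ValueError("points must not be empty")
    rng = random.Random(seed)
    # Use at least one sample point per partition, but never more than we have.
    sample_size = max(num_partitions, int(len(points) * sample_ratio))
    sample_size = min(sample_size, len(points))
    xs = sorted(p[0] for p in rng.sample(points, sample_size))
    # Equal-count cuts over the sample approximate equal-count cuts overall.
    return [xs[(i * len(xs)) // num_partitions] for i in range(1, num_partitions)]


def assign_partition(point, boundaries):
    """Return the index of the strip that contains the point."""
    return bisect_right(boundaries, point[0])


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.random(), rng.random()) for _ in range(10_000)]
    cuts = sample_boundaries(data, num_partitions=4)
    sizes = [0] * 4
    for p in data:
        sizes[assign_partition(p, cuts)] += 1
    print(cuts, sizes)
```

Only the sample is sorted, so the cost of choosing boundaries stays small even when the full dataset is large; the full data is scanned once for assignment.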

Cited by 105 publications (133 citation statements)
References 5 publications
“…In the Language layer, SpatialHadoop adds a simple and expressive high level language for spatial data types and operations. In the Storage layer, SpatialHadoop adapts traditional spatial index structures as Grid, R-tree and R+-tree, to form a two-level spatial index [27]. SpatialHadoop enriches the MapReduce layer by new components to implement efficient and scalable spatial data processing.…”
Section: Spatialhadoop (mentioning)
confidence: 99%
“…Multiple matching methods and multiple assignment methods are adopted to address these issues [18,38]. They work perfectly for the spatial query and join operations, despite the slightly increased overhead in the latter method due to the multiple duplications.…”
Section: Boundary Object Processing (mentioning)
confidence: 99%
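The multiple assignment idea in the statement above can be sketched as follows. This is an assumption-laden illustration, not the cited papers' code: it uses a uniform grid with a hypothetical `cell_size`, copies each rectangle into every cell it overlaps, and removes the resulting duplicates at query time with a plain set of identifiers.

```python
def overlapping_cells(rect, cell_size):
    """List the (column, row) grid cells that a rectangle touches."""
    x0, y0, x1, y1 = rect
    cols = range(int(x0 // cell_size), int(x1 // cell_size) + 1)
    rows = range(int(y0 // cell_size), int(y1 // cell_size) + 1)
    return [(c, r) for c in cols for r in rows]


def multiple_assignment(rects, cell_size):
    """Replicate each rectangle id into every cell it overlaps."""
    grid = {}
    for rect_id, rect in rects.items():
        for cell in overlapping_cells(rect, cell_size):
            grid.setdefault(cell, []).append(rect_id)
    return grid


def range_query(grid, rects, query, cell_size):
    """Return ids of rectangles intersecting the query, each reported once."""
    qx0, qy0, qx1, qy1 = query
    hits = set()  # the set absorbs copies found in more than one cell
    for cell in overlapping_cells(query, cell_size):
        for rect_id in grid.get(cell, []):
            x0, y0, x1, y1 = rects[rect_id]
            if x0 <= qx1 and qx0 <= x1 and y0 <= qy1 and qy0 <= y1:
                hits.add(rect_id)
    return hits


if __name__ == "__main__":
    rects = {"a": (0.5, 0.5, 1.5, 1.5), "b": (3.2, 3.2, 3.8, 3.8)}
    grid = multiple_assignment(rects, cell_size=1.0)
    print(range_query(grid, rects, (0.0, 0.0, 2.0, 2.0), cell_size=1.0))
```

The replication is where the extra overhead mentioned in the statement comes from: a rectangle that straddles a cell boundary is stored, and later examined, once per cell.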
“…In order to make uniform distribution of the spatial data and ensure the node load balance, we improve the quadtree algorithm [11] and design a better partitioning strategy for spatio-temporal data, and we term this algorithm the QaDTree algorithm, which generates the global index using the improved quadtree algorithm to partition the spatio-temporal data and the local indexes using 3DR-tree index to manage the spatio-temporal data [12]. The tree non-leaf node of the global index contains four children, each of which represents a subspace area.…”
Section: The Qadtree Algorithm (mentioning)
confidence: 99%
confidence: 99%
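The four-children structure described in the statement above can be sketched with a plain point quadtree. This is a generic textbook version under stated assumptions, not the QaDTree algorithm itself: the names `build_quadtree` and `leaves`, the `capacity` threshold, and the `max_depth` guard are all hypothetical. A cell is split into four equal subspaces whenever it holds more points than `capacity`, and the leaves serve as partitions.

```python
def build_quadtree(points, bounds, capacity, max_depth=16):
    """Recursively split a cell into four children until it is small enough."""
    x0, y0, x1, y1 = bounds
    if len(points) <= capacity or max_depth == 0:
        return {"bounds": bounds, "points": list(points), "children": None}
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    # Order matches the index computed below: SW, SE, NW, NE.
    quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
    buckets = [[] for _ in quadrants]
    for x, y in points:
        index = (1 if x >= mx else 0) + (2 if y >= my else 0)
        buckets[index].append((x, y))
    children = [build_quadtree(b, q, capacity, max_depth - 1)
                for b, q in zip(buckets, quadrants)]
    return {"bounds": bounds, "points": [], "children": children}


def leaves(node):
    """Collect the leaf cells, which play the role of partitions."""
    if node["children"] is None:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (0.8, 0.1), (0.1, 0.8)]
    tree = build_quadtree(pts, (0.0, 0.0, 1.0, 1.0), capacity=1)
    for leaf in leaves(tree):
        print(leaf["bounds"], leaf["points"])
```

Dense regions end up with deeper, smaller cells, which is why a quadtree tends to balance partition sizes better than a uniform grid on skewed data.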