Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073)
DOI: 10.1109/icde.2000.839452

Data redundancy and duplicate detection in spatial join processing

Abstract: The Partition Based Spatial-Merge Join (PBSM) of Patel and DeWitt and the Size Separation Spatial Join (S³J) of Koudas and Sevcik are considered to be among the most efficient methods for processing spatial (intersection) joins on two or more spatial relations. Neither method assumes the presence of pre-existing spatial indices on the relations. In this paper, we propose several improvements of these join algorithms. In particular, we deal with the impact of data redundancy and duplicate detection on t…

Cited by 68 publications (65 citation statements)
References 21 publications

“…Doing so has the advantage that only elements in the same partition need to be compared to perform the spatial join. Replicating elements, however, has several disadvantages: 1) replicated elements need more space on disk as well as more disk reads and more comparisons for the join and 2) results may be detected twice and deduplication is required (at runtime [25] or at the end).…”
Section: B Space-oriented Partitioning (mentioning)
confidence: 99%
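
To make the duplicate problem in the statement above concrete, here is a minimal sketch, assuming axis-aligned rectangles given as `(xlo, ylo, xhi, yhi)` tuples and a uniform grid of square cells. The grid, the helper names (`cells_for`, `partition`, `join_with_replication`) and the sample data are illustrative assumptions and are not taken from the cited papers. Each rectangle is replicated into every cell it overlaps, so a pair whose rectangles share several cells is reported once per shared cell, and the duplicates are removed at the end.

```python
from itertools import product

def cells_for(rect, cell_size):
    """Return every grid cell (ix, iy) that the rectangle overlaps."""
    xlo, ylo, xhi, yhi = rect
    xs = range(int(xlo // cell_size), int(xhi // cell_size) + 1)
    ys = range(int(ylo // cell_size), int(yhi // cell_size) + 1)
    return list(product(xs, ys))

def intersects(a, b):
    """Closed-rectangle intersection test."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def partition(rects, cell_size):
    """Replicate each rectangle (by index) into all cells it overlaps."""
    parts = {}
    for i, r in enumerate(rects):
        for c in cells_for(r, cell_size):
            parts.setdefault(c, []).append(i)
    return parts

def join_with_replication(R, S, cell_size):
    """Join partition by partition; the raw output may contain duplicates."""
    pr, ps = partition(R, cell_size), partition(S, cell_size)
    raw = []
    for c in pr.keys() & ps.keys():
        for i in pr[c]:
            for j in ps[c]:
                if intersects(R[i], S[j]):
                    raw.append((i, j))
    return raw

# Hypothetical sample data: one rectangle per relation, sharing four cells.
R = [(0.5, 0.5, 2.5, 1.5)]
S = [(1.5, 0.2, 3.5, 1.8)]
raw = join_with_replication(R, S, cell_size=1.0)
unique = sorted(set(raw))  # deduplication at the end
```

In this example the single result pair is found in each of the four cells the two rectangles share, which is the extra comparison and deduplication cost the statement refers to.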
“…This means that queries such as those that seek the length of all objects in a particular spatial region will have to remove duplicate objects before reporting the total length. Nevertheless, methods have been developed that avoid these duplicates by making use of the geometry of the type of the data that is being represented (e.g., (Aref and Samet, 1992; Aref and Samet, 1994; Dittrich and Seeger, 2000)). Note that the result of constraining the positions of the partitions means that there is a limit on the possible sizes of the resulting cells (e.g., a power of 2 in the case of a quadtree variant).…”
Section: Methods Based On Spatial Occupancy (mentioning)
confidence: 99%
“…First, the R*-tree index of R_P is built. Next, each spatial object s in S_P is traversed; R*-tree is searched for the spatial objects in R_P whose MBR overlaps with the MBR of s. Prior to the spatial predicate verification, we use the reference point method [37] to avoid duplicates. If the reference point of r and s is not in the partition, the current calculation is terminated.…”
Section: In-memory Spatial Join (mentioning)
confidence: 99%
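
The reference point test described in the statement above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited authors' implementation: objects are reduced to their MBRs as `(xlo, ylo, xhi, yhi)` tuples, partitions are cells of a uniform grid, a nested loop stands in for the R*-tree search, and the reference point is taken to be the lower-left corner of the intersection of the two MBRs. All function names are hypothetical.

```python
def intersects(a, b):
    """Closed-rectangle intersection test on MBRs."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def reference_point(a, b):
    """Lower-left corner of the intersection of two intersecting MBRs (assumed choice)."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def cell_of(point, cell_size):
    """Grid cell (ix, iy) containing the point."""
    return (int(point[0] // cell_size), int(point[1] // cell_size))

def cells_for(rect, cell_size):
    """All grid cells that the MBR overlaps."""
    xs = range(int(rect[0] // cell_size), int(rect[2] // cell_size) + 1)
    ys = range(int(rect[1] // cell_size), int(rect[3] // cell_size) + 1)
    return [(x, y) for x in xs for y in ys]

def join_reference_point(R, S, cell_size):
    """Report each intersecting pair only in the cell holding its reference point."""
    pr, ps = {}, {}
    for parts, rects in ((pr, R), (ps, S)):
        for i, r in enumerate(rects):
            for c in cells_for(r, cell_size):
                parts.setdefault(c, []).append(i)
    out = []
    for c in sorted(pr.keys() & ps.keys()):
        for i in pr[c]:
            for j in ps[c]:
                if not intersects(R[i], S[j]):
                    continue
                # Skip the pair unless this partition owns the reference point.
                if cell_of(reference_point(R[i], S[j]), cell_size) == c:
                    out.append((i, j))
    return out

# Hypothetical sample data: the two MBRs share four grid cells.
R = [(0.5, 0.5, 2.5, 1.5)]
S = [(1.5, 0.2, 3.5, 1.8)]
result = join_reference_point(R, S, cell_size=1.0)
```

Because the reference point lies inside both MBRs, exactly one of the shared cells contains it, so the pair is emitted once and no separate deduplication pass is needed. In a full system the exact spatial predicate would be evaluated only for pairs that pass this test.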