Sibling clustering of tree-based spatial indexes for efficient spatial query processing

Kim, Kihong; Sang, Kun

doi:10.1145/288627.288686

Cited by 10 publications

(7 citation statements)

References 21 publications

“…"Sibling indexing" is used in the work by [2] and [9]. However, creating external indices considerably increases the amount of meta-data and memory needed.…”

Section: Related Work (mentioning)

confidence: 99%

Sibling‐First Data Organization for Parse‐Free XML Data Processing

Homayounfar

Wang²

2006

International Journal of Web Information Systems

Abstract: XML is becoming one of the most important structures for data exchange on the web. Despite its many advantages, the XML structure imposes several major obstacles to processing large documents. The mismatch between the linear nature of the algorithms currently used in operating systems and databases (e.g. for caching and prefetching) and the non-linear structure of XML data makes XML processing more costly. In addition to verbosity (e.g. tag redundancy), interpreting (i.e. parsing) the depth-first (DF) structure of XML documents is a significant overhead for processing applications (e.g. query engines). Recent research on XML query processing has found that sibling clustering can improve performance significantly. However, the existing clustering methods are not able to avoid parsing overhead as they are limited by larger document sizes. In this research, we have developed a better data organization for native XML databases, named the sibling-first (SF) format, that improves query performance significantly. SF uses an embedded index for fast access to child nodes. It also compresses documents by eliminating extra information from the original DF format. The converted SF documents can be processed for XPath queries without being parsed. We have implemented the SF storage in virtual memory as well as a format on disk. Experimental results with real data have shown that significantly higher performance can be achieved when XPath queries are run on very large SF documents.

“…First, as shown in [24], RJ is significantly faster than HJ in terms of CPU-time. Second, as shown in [16], when a R-tree packing method that places sibling nodes in sequence is used, the I/O performance of RJ is significantly improved. In the rest of the paper, we do not consider the difference between random and sequential I/O accesses.…”

Section: Spatial Hash-join (mentioning)

confidence: 99%

“…Files AS, AL (http://www.maproom.psu.edu/dcw/) comprise street and railroad segments, respectively, of Germany. T1 contains streets and T2 river and railroad segments of California [7]; both files are commonly used to benchmark spatial join algorithms [5], [21], [14], [16]. The synthetic files G1 and G2 were created according to a Gaussian distribution with 16 clusters.…”

Section: Experimental Evaluation (mentioning)

confidence: 99%

Slot index spatial join

Mamoulis

Papadias

2003

IEEE Trans. Knowl. Data Eng.

Abstract: Efficient processing of spatial joins is very important due to their high cost and frequent application in spatial databases and other areas involving multidimensional data. This paper proposes slot index spatial join (SISJ), an algorithm that joins a nonindexed data set with one indexed by an R-tree. We explore two optimization techniques that reduce the space requirements and the computational cost of SISJ and we compare it, analytically and experimentally, with other spatial join methods for two cases: 1) when the nonindexed input is read from disk and 2) when it is an intermediate result of a preceding database operator in a complex query plan. The importance of buffer splitting between consecutive join operators is also demonstrated through a two-join case study and a method that estimates the optimal splitting is proposed. Our evaluation shows that SISJ outperforms alternative methods in most cases and is suitable for limited memory conditions.

“…First, as we show in section 3, RJ is significantly faster than HJ in terms of CPU-time. Second, as shown in [KC98], when a R-tree packing method that places sibling nodes in sequence is used, the I/O performance of RJ in terms of I/O is significantly improved. In the rest of the paper, we will not consider the difference between random and sequential I/O accesses.…”

Section: Spatial Hash-join (mentioning)

confidence: 99%

Integration of spatial join algorithms for processing multiple inputs

Mamoulis

Papadias

1999

SIGMOD Rec.

Several techniques that compute the join between two spatial datasets have been proposed during the last decade. Among these methods, some consider existing indices for the joined inputs, while others treat datasets with no index, providing solutions for the case where at least one input comes as an intermediate result of another database operator. In this paper we analyze previous work on spatial joins and propose a novel algorithm, called slot index spatial join (SISJ), that efficiently computes the spatial join between two inputs, only one of which is indexed by an R-tree. Going one step further, we show how SISJ and other spatial join algorithms can be implemented as operators in a database environment that joins more than two spatial datasets. We study the differences between relational and spatial multiway joins, and propose a dynamic programming algorithm that optimizes the execution of complex spatial queries.
