2005
DOI: 10.1007/11546924_60

An Optimal Skew-insensitive Join and Multi-join Algorithm for Distributed Architectures

Cited by 13 publications
(14 citation statements)
References 9 publications

“…To avoid the slowdown usually caused by AVS and the imbalance of the size of local joins processed by the standard join algorithms, an appropriate treatment for high attribute frequencies is needed (Bamha and Hains, 1999; Bamha and Hains, 2000; Bamha, 2005).…”
Section: Phase 3: Creating the Communication Templates (mentioning)
confidence: 99%
“…At the end of steps 4.a and 4.b, each processor i, has local knowledge of how the tuples of semi-joins (Bamha, 2005), we can deduce that the tuples of…”
Section: B Redistribution Of Tuples With Values (mentioning)
confidence: 99%
“…However, these algorithms cannot solve load imbalance problem as they base their routing decisions on incomplete or statistical information. On the contrary, the algorithms we presented in (Bamha and Hains, 1999; Bamha and Hains, 2000; Bamha, 2005) for treating queries involving one join operation use a total data-distribution information in the form of histograms. The parallel cost model we apply allows us to guarantee that histogram management has a negligible cost when compared to the efficiency gains it provides to reduce the communication cost and to avoid load imbalance between processors.…”
Section: Introduction (mentioning)
confidence: 99%
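
The statement above contrasts routing on sampled statistics with routing on complete frequency histograms of the join attribute. As a rough illustration only, and not the algorithm of the cited papers, the sketch below sums per-processor histograms into a global one and then places each join value on the currently least-loaded processor, estimating a value's cost as the product of its frequencies in the two relations. The function names `global_histogram` and `assign_values`, the greedy placement rule, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def global_histogram(fragments):
    """Sum the per-processor frequency histograms of the join attribute."""
    total = Counter()
    for fragment in fragments:
        total.update(Counter(fragment))
    return total


def assign_values(hist_r, hist_s, num_procs):
    """Greedily assign each join value to the least-loaded processor.

    Hypothetical placement rule: the cost of a value is estimated as
    freq_R * freq_S, the number of result tuples it produces. Values
    missing from either relation are skipped because they cannot join.
    """
    loads = [0] * num_procs
    owner = {}
    common = set(hist_r) & set(hist_s)
    # Place the most expensive values first; ties are broken by value.
    for value in sorted(common, key=lambda v: (-hist_r[v] * hist_s[v], v)):
        target = min(range(num_procs), key=loads.__getitem__)
        owner[value] = target
        loads[target] += hist_r[value] * hist_s[value]
    return owner, loads


# Made-up join-attribute values held by two processors for relations R and S.
r_fragments = [[1, 1, 1, 2], [1, 3, 4]]
s_fragments = [[1, 2, 2], [1, 3, 5]]

hist_r = global_histogram(r_fragments)
hist_s = global_histogram(s_fragments)
owner, loads = assign_values(hist_r, hist_s, num_procs=2)
print(owner, loads)
```

In this toy data the value 1 dominates the estimated cost, so one processor still carries most of the work; a whole-value placement like this one cannot split a single heavy value, which is the kind of case a dedicated treatment of high attribute frequencies has to address.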
“…The main difficulty in such applications is that the result of these analytical queries must be obtained interactively (Datta et al., 1998; Tsois and Sellis, 2003) despite the huge volume of data in warehouses and their rapid growth especially in OLAP systems (Datta et al., 1998). For this reason, parallel processing of these queries is highly recommended in order to obtain acceptable response time (Bamha, 2005). Research has shown that join, which is one of the most expensive operations in DBMS, is parallelizable with near-linear speed-up only in ideal cases (Bamha and Hains, 2000).…”
Section: Introduction (mentioning)
confidence: 99%