2018
DOI: 10.1145/3241039

Distributed Joins and Data Placement for Minimal Network Traffic

Abstract: Network communication is the slowest component of many operators in distributed parallel databases deployed for large-scale analytics. Whereas considerable work has focused on speeding up databases on modern hardware, communication reduction has received less attention. Existing parallel DBMSs rely on algorithms designed for disks with minor modifications for networks. A more complicated algorithm may burden the CPUs but could avoid redundant transfers of tuples across the network. We introduce track join, a n…

Cited by 21 publications (46 citation statements)
References 62 publications
“…Query planning. Database research has explored query planning and optimization extensively [4,32,36]. Gigascope performs query partitioning to minimize the data transfer from the capture card to the stream processor [10].…”
Section: Related Work (mentioning)
confidence: 99%
“…Gigascope performs query partitioning to minimize the data transfer from the capture card to the stream processor [10]. Sensor networks have explored the query partitioning problems that are similar to those that Sonata faces [4,27,28,32,36,46]. However, these systems face different optimization problems because they typically involve lower traffic rates and involve special-purpose queries.…”
Section: Related Work (mentioning)
confidence: 99%
“…Network operators have leveraged the recent advances in the area of scalable streaming data analysis [2,16,24,29,32] to build platforms capable of processing network data at very high rates. The database community has also explored the query optimization problem extensively [3,27]. Gigascope [13] uses query partitioning to minimize the data transfer within the stream processor.…”
Section: Related Work (mentioning)
confidence: 99%
“…Therefore, retrieving the global skylines from all the distributed local sites with minimum communication overhead in these scenarios will be an important and challenging problem, because of the consideration for the introduction of the bandwidth consumption and network delay. Until recently, there appear some works that target at distributed queries over uncertain data, such as top-k dominating and join queries …”
Section: Introduction (mentioning)
confidence: 99%
“…Given three 2-dimensional interval objects a = {[10,20],[20,40]}, b = {[12,16],[10,30]}, and c = {[8,22],[28,32]} (see…”
mentioning
confidence: 99%