2015
DOI: 10.14778/2777598.2777602

On the surprising difficulty of simple things

Abstract: Partitioning a dataset into ranges is a task that is common in various applications such as sorting [1,6,7,8,9] and hashing [3], which are in turn building blocks for almost any type of query processing. Radix-based partitioning is especially popular because of its simplicity and its high performance compared with comparison-based versions [6].

Citations: cited by 22 publications (19 citation statements)
References: 6 publications (16 reference statements)
“…Using a real hash function would make all our algorithms slower by the same constant and thus result in even lower relative overhead of reproducibility. Our implementation of PARTITIONANDAGGREGATE is up to 4 times faster than that of Cieslewicz and Ross [11] because we use the highly-tuned partitioning routine used in other work [9,31,33]. Back-of-the-envelope calculations suggest that we achieve the same performance as the implementations used in [26], as well, thereby ensuring our baseline for GROUPBY matches the state of the art.…”
Section: A Experimental Setup (mentioning)
confidence: 97%
“…In this case, i.e., if F = 1, PARALLELPARTITION is a no-op that forwards its input. Since modern hardware can run PARTITIONING efficiently only up to a certain fanout [9,26,33], we implement it recursively using zero or more levels of partitioning, i.e., we partition with F = f^d for f = 256 and d = 0, 1, . .…”
Section: B High-level Algorithm Structure (mentioning)
confidence: 99%
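
The recursive scheme described in the statement above, a total fanout of F = f^d reached through d passes of fanout f, can be illustrated with a minimal sketch. It is not the cited authors' tuned routine: the names `partition_once` and `recursive_partition` are hypothetical, keys are assumed to be non-negative integers whose low bits serve as the partitioning digits, and Python lists stand in for the preallocated buffers a real implementation would use.

```python
def partition_once(keys, bits, shift):
    """One partitioning pass with fanout 2**bits on the digit at `shift`."""
    fanout = 1 << bits
    mask = fanout - 1
    buckets = [[] for _ in range(fanout)]
    for k in keys:
        buckets[(k >> shift) & mask].append(k)
    return buckets


def recursive_partition(keys, depth, bits=8, level=0):
    """Split `keys` into (2**bits)**depth partitions using `depth` passes.

    With depth == 0 this is a no-op that forwards its input as a single
    partition, matching the F = 1 case in the quoted statement.
    """
    if level == depth:
        return [list(keys)]
    # Process the most significant of the `depth` digits first, so the final
    # partition index equals the low depth*bits bits of the key.
    shift = (depth - 1 - level) * bits
    result = []
    for bucket in partition_once(keys, bits, shift):
        result.extend(recursive_partition(bucket, depth, bits, level + 1))
    return result
```

With `bits=8` each pass has fanout f = 256, so `depth=2` yields 65,536 partitions while no single pass writes to more than 256 output buckets at a time.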
“…The field of adaptive data stores is a hot research topic with a series of novel approaches, such as the popular database cracking [22,24,28,29], its variations and analysis [30,58,59,60], advanced partitioning [32,44,61] or adaptive resp. holistic indexing [5,47,57].…”
Section: Related Work (mentioning)
confidence: 99%
“…It is a fundamental technique for indexing, join processing, and sorting. We investigate two state-of-the-art out-of-place partitioning algorithms [18], which either perform a histogram generation pass beforehand or maintain a linked list of chunks inside the partitions to handle the key distribution. We also test a version enlarging the partitions adaptively using mremap.…”
Section: Structural Flexibility (mentioning)
confidence: 99%
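
The first of the two out-of-place variants mentioned in the statement above, the one that runs a histogram generation pass beforehand, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the implementation evaluated in the paper: the function name `radix_partition` and its parameters are hypothetical, and keys are assumed to be non-negative integers.

```python
def radix_partition(keys, bits, shift=0):
    """Out-of-place radix partitioning with a preceding histogram pass.

    Returns the partitioned output array and the start offset of each
    partition within it.
    """
    fanout = 1 << bits
    mask = fanout - 1

    # Pass 1: count how many keys fall into each partition.
    hist = [0] * fanout
    for k in keys:
        hist[(k >> shift) & mask] += 1

    # An exclusive prefix sum turns the counts into start offsets.
    offsets = [0] * fanout
    total = 0
    for p in range(fanout):
        offsets[p] = total
        total += hist[p]

    # Pass 2: scatter each key to the next free slot of its partition.
    out = [None] * len(keys)
    cursor = list(offsets)
    for k in keys:
        p = (k >> shift) & mask
        out[cursor[p]] = k
        cursor[p] += 1
    return out, offsets
```

Because the histogram fixes every partition's size up front, the output can be allocated once at exactly the right size. The price is a second read of the input, which is the trade-off that the chunk-list and adaptively enlarged variants named in the statement avoid.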