2016
DOI: 10.12694/scpe.v17i4.1207

AQsort: Scalable Multi-Array In-Place Sorting with OpenMP

Abstract: A new multi-threaded variant of the quicksort algorithm called AQsort and its C++/OpenMP implementation are presented. AQsort operates in place and was primarily designed for high-performance computing (HPC) runtime environments. It can work with multiple arrays at once; such a functionality is frequently required in HPC and cannot be accomplished with standard C pointer-based or C++ iterator-based approach. An extensive study is provided that evaluates AQsort experimentally and compares its performance with m…

Cited by 4 publications (18 citation statements)
References 25 publications
“…Step 2 (Sorting of Morton codes): The parallelization of this step is straightforward by using any parallel in-place sort (e.g., sort method from std::algorithm [20] or AQsort [21]
parallel add to every nonzero element its Morton code;
3: parallel sort nonzero elements on its Morton code;
4: len = N/th;
5: start of parallel block
6: tid = get tid of current thread ();
7: if tid = 0 then start ← tid · len;
15: for j ← 1, c max do diff ← XOR(new, old);
20: old ← new;
21: k ← round up(Highest1(diff)/2);
22: for j ← 1, k do …”
Section: Parallelization 1) Sw Technologies
mentioning
confidence: 99%
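
The two steps quoted above (attach a Morton code to every nonzero element, then sort the elements by that code) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the citing paper's implementation: the `Nonzero` record, `spread_bits`, `morton_code`, and `sort_by_morton` are hypothetical names, a 2D matrix with 32-bit indexes is assumed, and a sequential `std::sort` stands in for whichever parallel in-place sort would be used in practice.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for one nonzero matrix element (illustrative only).
struct Nonzero {
    std::uint32_t row;
    std::uint32_t col;
    double value;
    std::uint64_t morton;  // filled in by sort_by_morton
};

// Spread the 32 bits of x over the even bit positions of a 64-bit word.
inline std::uint64_t spread_bits(std::uint32_t x) {
    std::uint64_t v = x;
    v = (v | (v << 16)) & 0x0000FFFF0000FFFFULL;
    v = (v | (v << 8))  & 0x00FF00FF00FF00FFULL;
    v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    v = (v | (v << 2))  & 0x3333333333333333ULL;
    v = (v | (v << 1))  & 0x5555555555555555ULL;
    return v;
}

// Interleave the bits: row bits take odd positions, column bits even ones.
inline std::uint64_t morton_code(std::uint32_t row, std::uint32_t col) {
    return (spread_bits(row) << 1) | spread_bits(col);
}

void sort_by_morton(std::vector<Nonzero>& elems) {
    // Attach a Morton code to every nonzero element. The iterations are
    // independent, so this loop could be distributed among threads.
    for (std::size_t i = 0; i < elems.size(); ++i)
        elems[i].morton = morton_code(elems[i].row, elems[i].col);

    // Sort the elements by their Morton code. Sequential std::sort is a
    // stand-in here for a parallel in-place sort.
    std::sort(elems.begin(), elems.end(),
              [](const Nonzero& a, const Nonzero& b) { return a.morton < b.morton; });
}
```

Storing the code inside each record keeps this sketch compatible with an ordinary iterator-based sort; if rows, columns, and values lived in separate arrays instead, a sort that accepts a user-provided swap would be needed.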
“…AQsort is our previous parallel quicksort implementation built upon OpenMP. Its main feature is that, in contrast to the other implementations, it is capable of working with a user-provided function for swapping elements. This slightly reduces optimization options, but, effectively, allows sorting multiple datasets (such as arrays) at once.…”
mentioning
confidence: 99%
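
To illustrate why a user-provided swap function makes multi-array sorting possible, here is a minimal sequential sketch. It is not AQsort's code or API: `index_quicksort` and `sort_coo` are hypothetical names, and the algorithm is a plain single-threaded quicksort chosen for brevity. The point is only that the sort never touches the data directly; it works with indexes and delegates comparison and swapping to callbacks, so one swap can exchange elements in several arrays at once.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative quicksort over the index range [lo, hi), driven by callbacks:
//   comp(i, j)  returns true if the element at index i must precede the one at j
//   swap(i, j)  exchanges the elements at indexes i and j in all arrays involved
template <typename Comp, typename Swap>
void index_quicksort(std::size_t lo, std::size_t hi, Comp comp, Swap swap) {
    while (hi - lo > 1) {
        // Move the middle element to the end and use it as the pivot.
        const std::size_t last = hi - 1;
        swap(lo + (hi - lo) / 2, last);

        // Lomuto partition: elements preceding the pivot go to the front.
        std::size_t store = lo;
        for (std::size_t i = lo; i < last; ++i) {
            if (comp(i, last)) {
                swap(i, store);
                ++store;
            }
        }
        swap(store, last);  // the pivot reaches its final position

        // Recurse into the smaller part, continue the loop on the larger one.
        if (store - lo < hi - (store + 1)) {
            index_quicksort(lo, store, comp, swap);
            lo = store + 1;
        } else {
            index_quicksort(store + 1, hi, comp, swap);
            hi = store;
        }
    }
}

// Hypothetical usage: sort three parallel arrays (row indexes, column
// indexes, values) lexicographically by (row, column).
void sort_coo(std::vector<int>& rows, std::vector<int>& cols, std::vector<double>& vals) {
    auto comp = [&](std::size_t i, std::size_t j) {
        return rows[i] < rows[j] || (rows[i] == rows[j] && cols[i] < cols[j]);
    };
    auto swap_all = [&](std::size_t i, std::size_t j) {
        std::swap(rows[i], rows[j]);
        std::swap(cols[i], cols[j]);
        std::swap(vals[i], vals[j]);
    };
    index_quicksort(0, rows.size(), comp, swap_all);
}
```

Because the sort only sees indexes, it cannot copy a pivot into a temporary or move elements through a buffer, which is one way to read the quoted remark that the approach slightly reduces optimization options.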
“…To some degree this is affected by the overhead imposed by the high-level library used in the programming effort. We can still draw however some reliable conclusions and reason about the performance of these implementations using the MBSP model, thus making MBSP useful and usable. Integer sorting on multicores and GPUs can be realized by traditional distribution-specific algorithms such as radix-sort [3,12,25,28], or variants of it that use fewer rounds of the baseline count-sort implementation provided additional information about key values is available [6,39]. Other approaches include algorithms that use specialized hardware or software features of a particular multicore architecture [4,6,22,25]. Comparison-based algorithms have also been used with some obvious tweaks: use of deterministic regular sampling sorting [34] that utilizes serial radix-sort for local sorting [8,9,10] or use other methods for local sorting [38,3,5,6,22].…”
mentioning
confidence: 99%
“…Integer sorting on multicores and GPUs can be realized by traditional distribution-specific algorithms such as radix-sort [3,12,25,28], or variants of it that use fewer rounds of the baseline count-sort implementation provided additional information about key values is available [6,39].…”
mentioning
confidence: 99%