2017
DOI: 10.1002/cpe.4190
Piecewise holistic autotuning of parallel programs with CERE

Abstract: Current architecture complexity requires fine tuning of compiler and runtime parameters to achieve best performance. Autotuning substantially improves default parameters in many scenarios, but it is a costly process requiring long iterative evaluations. We propose an automatic piecewise autotuner based on CERE (Codelet Extractor and REplayer). CERE decomposes applications into small pieces called codelets: each codelet maps to a loop or to an OpenMP parallel region and can be replayed as a standalone p…
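To make the codelet notion in the abstract concrete, the sketch below shows the kind of OpenMP parallel loop that CERE could extract and replay as a standalone piece. The kernel, names, and sizes are invented for illustration and do not come from the paper.

    /* Hypothetical hot kernel: a codelet would map to this OpenMP parallel
       loop, which CERE can capture and re-execute in isolation. */
    #include <stdlib.h>

    void stencil_codelet(double *out, const double *in, size_t n)
    {
        #pragma omp parallel for
        for (size_t i = 1; i < n - 1; i++) {
            /* The loop (or its enclosing parallel region) is the unit
               that gets extracted and replayed standalone. */
            out[i] = 0.25 * (in[i - 1] + 2.0 * in[i] + in[i + 1]);
        }
    }

    int main(void)
    {
        enum { N = 1 << 20 };
        double *in  = malloc(N * sizeof *in);
        double *out = malloc(N * sizeof *out);
        for (size_t i = 0; i < N; i++)
            in[i] = (double)i;
        stencil_codelet(out, in, N);
        free(in);
        free(out);
        return 0;
    }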

Cited by 8 publications (11 citation statements)
References 39 publications
“…Each of the previous works is able to achieve the best performance for some of the benchmarks, but none of them explores enough of the search space to achieve the best for all, as we do. (PPP=Page Placement Policy, PND=Page NUMA Degree, TPP=Thread Placement Policy, NT=Number of Threads, TND=Thread NUMA Degree) Previous work PPP: [6,10,23,24,26], TPP/NT/TND: [12,17,27,29,30,34], PPP/TPP: [11]. Our work includes all optimizations and performs significantly better.…”
Section: Codelet Search Speed (mentioning)
Confidence: 96%
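The quoted statement lists the five dimensions the citing authors tune jointly. As a rough sense of why exploring them together is costly, the sketch below enumerates a Cartesian product over small, invented value sets for each dimension; the actual candidate values and grid sizes are not taken from either paper.

    #include <stdio.h>

    int main(void)
    {
        /* Placeholder value sets, one per tuning dimension from the citation. */
        const char *ppp[] = { "firsttouch", "interleave" };   /* Page Placement Policy   */
        const int   pnd[] = { 1, 2, 4 };                      /* Page NUMA Degree        */
        const char *tpp[] = { "compact", "scatter" };         /* Thread Placement Policy */
        const int   nt[]  = { 8, 16, 32 };                    /* Number of Threads       */
        const int   tnd[] = { 1, 2, 4 };                      /* Thread NUMA Degree      */

        long total = 0;
        for (size_t a = 0; a < sizeof ppp / sizeof *ppp; a++)
          for (size_t b = 0; b < sizeof pnd / sizeof *pnd; b++)
            for (size_t c = 0; c < sizeof tpp / sizeof *tpp; c++)
              for (size_t d = 0; d < sizeof nt / sizeof *nt; d++)
                for (size_t e = 0; e < sizeof tnd / sizeof *tnd; e++) {
                    total++;  /* one (PPP, PND, TPP, NT, TND) candidate configuration */
                    if (total == 1)
                        printf("example: %s, %d, %s, %d threads, %d\n",
                               ppp[a], pnd[b], tpp[c], nt[d], tnd[e]);
                }
        printf("candidate configurations: %ld\n", total);  /* 2*3*2*3*3 = 108 here */
        return 0;
    }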
“…Second, we use a codelet warm-up that executes the full codelet before taking performance measurements. This approach can be overly-optimistic [27], leading to better performance than expected. We further discuss these limitations in Section 6.…”
Section: Codelet Prediction Accuracy (mentioning)
Confidence: 99%
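The warm-up strategy described in this statement can be sketched as follows: the codelet runs once untimed before the timed replays, which is exactly why the measurement may look better than a real run that starts from a cold cache. The kernel, repetition count, and timing harness below are invented for illustration.

    #include <stdio.h>
    #include <omp.h>

    /* Stand-in for an extracted codelet (invented kernel). */
    static void codelet(double *buf, size_t n)
    {
        #pragma omp parallel for
        for (size_t i = 0; i < n; i++)
            buf[i] = buf[i] * 1.000001 + 1.0;
    }

    static double measure_codelet(double *buf, size_t n, int reps)
    {
        codelet(buf, n);                       /* warm-up pass, not timed: leaves
                                                  caches hot, hence the optimism */
        double t0 = omp_get_wtime();
        for (int r = 0; r < reps; r++)         /* timed replays */
            codelet(buf, n);
        return (omp_get_wtime() - t0) / reps;  /* mean seconds per replay */
    }

    int main(void)
    {
        enum { N = 1 << 20 };
        static double buf[N];
        printf("warm replay time: %g s\n", measure_codelet(buf, N, 10));
        return 0;
    }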
“…In their paper Piecewise holistic autotuning of parallel programs with CERE, the authors Mihail Popov, Chadi Akel, Yohan Chatelain, William Jalby, and Pablo de Oliveira Castro describe how to autotune codelets that have been produced by the Codelet Extractor and Replayer (CERE). Autotuning a set of codelets incurs lower costs than autotuning the entire program, and CERE helps in setting optimization options appropriately.…”
(mentioning)
Confidence: 99%
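As a rough illustration of the piecewise idea this description refers to, the sketch below combines per-codelet replay speedups, weighted by each codelet's share of the original runtime, into a whole-program estimate. The coverage numbers, speedups, and the Amdahl-style combination rule are assumptions made for the example, not figures from the paper.

    #include <stdio.h>

    struct codelet_result {
        const char *name;
        double coverage;   /* fraction of original runtime spent in this codelet */
        double speedup;    /* replay time (default config) / replay time (tuned) */
    };

    int main(void)
    {
        struct codelet_result r[] = {          /* invented measurements */
            { "loop_A",       0.45, 1.8 },
            { "omp_region_B", 0.30, 1.2 },
            { "loop_C",       0.15, 1.0 },
        };                                     /* remaining 10% of runtime untuned */

        double tuned_fraction = 0.0, remaining = 1.0;
        for (size_t i = 0; i < sizeof r / sizeof *r; i++) {
            tuned_fraction += r[i].coverage / r[i].speedup;  /* Amdahl-style term */
            remaining      -= r[i].coverage;
        }
        double predicted_speedup = 1.0 / (tuned_fraction + remaining);
        printf("predicted whole-program speedup: %.2f\n", predicted_speedup);
        return 0;
    }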