Proceedings of the 1st Workshop on AutotuniNg and aDaptivity AppRoaches for Energy Efficient HPC Systems 2017
DOI: 10.1145/3152821.3152877

Autotuning of OpenCL Kernels with Global Optimizations

Cited by 19 publications (17 citation statements)
References 19 publications
“…They may select one of the predefined variants of a tuned function [34], or generate and compile implementations according to the values of the tuning parameters. We distinguish between compiler-based tuning, where the space of code transformation is generated automatically [37,43,39] and user-defined code optimization parameters autotuning [35,13,36,15]. User-defined code optimization parameters tuning requires expert programmers to identify and implement tuning possibilities in the source code manually (e.g., by using preprocessor macros).…”
Section: Related Work (mentioning)
confidence: 99%
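
The user-defined approach described in this statement can be illustrated with a minimal sketch. It assumes the tuning possibilities are exposed as preprocessor macros and that each configuration is passed to the kernel compiler as `-D name=value` definitions. The parameter names, their values, and the helper functions below are hypothetical and are not taken from the cited papers or from any tuning library.

```python
from itertools import product

# Hypothetical tuning parameters; names and values are illustrative only.
TUNING_PARAMETERS = {
    "WORK_GROUP_SIZE": [64, 128, 256],
    "UNROLL_FACTOR": [1, 2, 4],
}


def configurations(parameters):
    """Yield every combination of parameter values as a dict."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


def build_options(configuration):
    """Turn one configuration into preprocessor definitions for the compiler."""
    return " ".join(f"-D {name}={value}" for name, value in configuration.items())


all_configs = list(configurations(TUNING_PARAMETERS))
first_options = build_options(all_configs[0])
```

Each generated option string would be handed to the kernel compiler, so the same source compiles into a different variant for every point in the tuning space.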
“…The KTT API has been derived from the CLTune project [35], so it is very similar to CLTune when we use it for offline tuning. Additionally, KTT API allows for tuning compositions of multiple kernels, tuning of how kernels are called from host code [15], and novel features for dynamic tuning.…”
Section: Architecture of the Kernel Tuning Toolkit (mentioning)
confidence: 99%
“…The benchmarks are composed of important computational kernels spanning across multiple application domains: 3D Fourier Reconstruction [9] and 2D convolution (adopted from CLTune [3]) are image processing kernels, BiCG, GEMM (adopted from CLTune [3]), GEMM batched, Matrix Transpose and Reduction [10] are linear algebra kernels, direct Coulomb summation [10] is a computational chemistry kernel, N-body (autotuned version of NVIDIA CUDA SDK sample) and Hotspot (based on implementation from Rodinia benchmark [11]) are differential equation solvers. These benchmarks autotune a variety of tuning parameters, changing implementation properties such as work-group size, cache blocking, thread coarsening, explicit caching in local memory, loop unrolling, explicit vectorization or data layout optimization (i.e., array of structures vs structure of arrays).…”
Section: Benchmark Set (mentioning)
confidence: 99%
“…We have implemented several optimization strategies, which may easily interfere with each other. Thus, we have used a Kernel Tuning Toolkit (KTT) (Filipovič et al., 2017) to automatically search for the optimal combination of optimizations.…”
Section: GPU Implementation (mentioning)
confidence: 99%
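
Why interfering optimizations call for a search over combinations, rather than tuning each one separately, can be shown with a small sketch. It uses plain exhaustive search and a made-up cost model in which local-memory caching only pays off for large work-groups. Neither the cost model nor the function names come from KTT or the cited paper, and a real tuner would measure kernel run times instead.

```python
from itertools import product

# Hypothetical search space; parameter names and values are illustrative only.
SEARCH_SPACE = {
    "WORK_GROUP_SIZE": [64, 128],
    "USE_LOCAL_MEM": [0, 1],
    "UNROLL": [1, 4],
}


def synthetic_runtime(cfg):
    """Made-up cost model standing in for a measured kernel run time."""
    runtime = 10.0
    if cfg["UNROLL"] > 1:
        runtime -= 2.0  # unrolling always helps in this toy model
    if cfg["USE_LOCAL_MEM"]:
        # Interaction: local memory hurts small work-groups, helps large ones.
        runtime += 3.0 if cfg["WORK_GROUP_SIZE"] < 128 else -4.0
    return runtime


def exhaustive_search(space, measure):
    """Evaluate every configuration and return the fastest with its time."""
    names = list(space)
    best_cfg, best_time = None, float("inf")
    for values in product(*(space[name] for name in names)):
        cfg = dict(zip(names, values))
        elapsed = measure(cfg)
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time


best_cfg, best_time = exhaustive_search(SEARCH_SPACE, synthetic_runtime)
```

In this toy model, enabling local memory at a work-group size of 64 makes the kernel slower, so deciding on that optimization in isolation at the small size would reject it, while the joint search finds that it wins at 128.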