Proceedings of the Tenth International Symposium on Code Generation and Optimization 2012
DOI: 10.1145/2259016.2259037

Auto-generation and auto-tuning of 3D stencil codes on GPU clusters

Abstract: This paper develops and evaluates search and optimization techniques for auto-tuning 3D stencil (nearest-neighbor) computations on GPUs. Observations indicate that parameter tuning is necessary for heterogeneous GPUs to achieve optimal performance with respect to a search space. Our proposed framework takes a most concise specification of stencil behavior from the user as a single formula, auto-generates tunable code from it, systematically searches for the best configuration and generates the code with optima…
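
As a point of reference for the term "3D stencil (nearest-neighbor) computation", the sketch below applies one sweep of a 7-point stencil, in which each interior grid point is updated from itself and its six face neighbors. It is a plain-Python illustration only, not the GPU code the paper's framework generates; the function name `stencil_7pt` and the coefficients `c0` and `c1` are assumptions chosen for this example.

```python
def stencil_7pt(grid, c0=0.5, c1=1.0 / 12.0):
    """One sweep of a 7-point nearest-neighbor stencil over a 3D grid.

    `grid` is a nested list indexed as grid[z][y][x]. The coefficients
    c0 (center) and c1 (each face neighbor) are illustrative assumptions.
    Boundary points are copied through unchanged.
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    # Copy the input so boundary values carry over and reads never see writes.
    out = [[row[:] for row in plane] for plane in grid]
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[z][y][x] = c0 * grid[z][y][x] + c1 * (
                    grid[z - 1][y][x] + grid[z + 1][y][x]
                    + grid[z][y - 1][x] + grid[z][y + 1][x]
                    + grid[z][y][x - 1] + grid[z][y][x + 1]
                )
    return out
```

On a GPU, loops of this kind are typically mapped to thread blocks, and the block dimensions are among the parameters an auto-tuner would search over.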

Cited by 106 publications (69 citation statements)
References 20 publications
“…Stencil-specific code generators have been used to generate and autotune stencil code on GPUs [10,37]. These techniques target shared memory.…”
Section: Compiler Optimizations, DSLs and Programming Models for Sten… (mentioning)
confidence: 99%
“…A description of the stencil is then stored as a Stencil object, which can be used by other modules for transformation purposes. Stencil description is typically adopted by domain-specific languages (DSLs) [23,47] to deal with this problem. However, while DSLs typically require the user to explicitly define the stencil, the Panda compiler is capable of detecting it automatically, like several existing tools [3,12,42,8].…”
Section: Overview (mentioning)
confidence: 99%
“…For small search spaces, an exhaustive search was used to determine the best run-time parameters [23], whereas for a larger search space, methods like dynamic programming or stochastic search can be used [17].…”
Section: Related Work (mentioning)
confidence: 99%
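
The contrast drawn in the last statement, exhaustive search for small parameter spaces versus stochastic search for larger ones, can be sketched as below. This is a minimal illustration under stated assumptions, not the method of any cited work: the parameter names (`block_x`, `block_y`, `unroll`), the value ranges, and the `synthetic_cost` function standing in for a measured kernel runtime are all hypothetical.

```python
import itertools
import random

# Hypothetical tuning space; names and values are assumptions for illustration.
SPACE = {
    "block_x": [16, 32, 64, 128],
    "block_y": [1, 2, 4, 8, 16],
    "unroll": [1, 2, 4, 8],
}


def synthetic_cost(block_x, block_y, unroll):
    """Stand-in for a measured runtime (made-up formula, not real GPU data)."""
    threads = block_x * block_y
    if threads > 1024:  # treat oversized thread blocks as invalid
        return float("inf")
    penalty = 0.1 if block_x < 32 else 0.0
    return abs(threads - 256) / 256 + abs(unroll - 4) / 4 + penalty


def exhaustive_search(space, cost):
    """Evaluate every configuration; feasible only for small spaces."""
    names = list(space)
    configs = (dict(zip(names, values))
               for values in itertools.product(*(space[n] for n in names)))
    best = min(configs, key=lambda cfg: cost(**cfg))
    return best, cost(**best)


def random_search(space, cost, trials, seed=0):
    """Sample a fixed number of configurations; a simple stochastic search."""
    rng = random.Random(seed)
    best_cfg, best_cost = None, float("inf")
    for _ in range(trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        c = cost(**cfg)
        if best_cfg is None or c < best_cost:
            best_cfg, best_cost = cfg, c
    return best_cfg, best_cost


if __name__ == "__main__":
    print("exhaustive:", exhaustive_search(SPACE, synthetic_cost))
    print("random (20 trials):", random_search(SPACE, synthetic_cost, trials=20))
```

The exhaustive variant evaluates all 80 configurations of this toy space and always finds a global minimum of the cost function. The random variant evaluates only `trials` of them, so it scales to spaces too large to enumerate but may return a worse configuration.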