2018 IEEE/ACM 5th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC) 2018
DOI: 10.1109/llvm-hpc.2018.8639402

User-Directed Loop-Transformations in Clang

Abstract: Directives for the compiler such as pragmas can help programmers to separate an algorithm's semantics from its optimization. This keeps the code understandable and easier to optimize for different platforms. Simple transformations such as loop unrolling are already implemented in most mainstream compilers. We recently submitted a proposal to add generalized loop transformations to the OpenMP standard. We are also working on an implementation in LLVM/Clang/Polly to show its feasibility and usefulness. The curre…

Citations: cited by 13 publications (20 citation statements)
References: 21 publications
“…Moreover, we illustrate two acceleration showcases for a detailed methodology discussion. As future work, we keep investigating advanced performance modeling and compiler optimizations [24] to provide better visual optimization guidance along with an increasingly higher degree of automatic optimizations. Moreover, we aim at incrementally relaxing methodology constraints such as required loop bounds and target frequency.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Free of dependencies, the optimal design performs a tiled computation unrolling the internal loop by a factor of 96 and then pipelining it, cyclic partitioning the local memory accordingly. Although with the advances in compiler technologies [24] it will be possible to widen the optimization space and require less manual intervention, the current degree of limitations potentially preventing optimal performance of state-of-the-art DSE engines makes this mixed optimization approach essential. Anyways, the final design has an estimated performance of 1.51 × 10^10 / (red triangle on blue dotted line in Figure 3) carrying a latency estimation error of only 0.000298% with respect to the results provided by Vivado HLS.…”
Section: N-body Simulation Test Case (citation type: mentioning)
confidence: 99%
“…We implemented a simple demonstration 1 in Python and slightly extended our implementation 2 of loop transformation directives from [3].…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…We have been working on improved loop transformations for Clang/LLVM [3]. In addition to the loop unrolling, unroll-and-jam, vectorization, and loop distribution pragmas already supported by Clang, we added tiling, loop interchange, reversal, array packing, and thread-parallelization directives.…”
Section: Motivation (citation type: mentioning)
confidence: 99%
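
For readers unfamiliar with the pragmas that the statement above describes as already supported by Clang, the following is a minimal sketch of three of them (unrolling, vectorization, and loop distribution) using Clang's existing `#pragma clang loop` hints. The function names and loop bodies are hypothetical and chosen only for illustration; they are not taken from the cited papers. The newly added directives (tiling, interchange, reversal, array packing, thread-parallelization) are deliberately not shown, because their syntax is not given here. The hints are requests, not guarantees: the optimizer may decline them, and compilers other than Clang typically ignore them.

```c
#include <stddef.h>

/* Unrolling: ask the optimizer to unroll the following loop by a factor of 4. */
long sum_unrolled(const int *a, size_t n) {
    long s = 0;
#pragma clang loop unroll_count(4)
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Vectorization: `restrict` promises that x and y do not alias,
 * and the hint asks the loop vectorizer to vectorize the loop. */
void axpy(float *restrict y, const float *restrict x, float k, size_t n) {
#pragma clang loop vectorize(enable)
    for (size_t i = 0; i < n; ++i)
        y[i] += k * x[i];
}

/* Distribution: ask the optimizer to split the loop so that the statement
 * with a loop-carried dependence (a) is separated from the independent one (b). */
void split(int *restrict a, int *restrict b, const int *restrict c, size_t n) {
#pragma clang loop distribute(enable)
    for (size_t i = 1; i < n; ++i) {
        a[i] = a[i - 1] + c[i]; /* depends on the previous iteration */
        b[i] = 2 * c[i];        /* independent across iterations */
    }
}
```

In each case the pragma changes only how the loop may be compiled, not what it computes, which is the separation of semantics from optimization that the abstract describes.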
“…For loop optimizations, we have implemented several new features in LLVM and Clang, and have described these enhancements in papers [65,66] and in several forums directly to the LLVM community (including talks at the LLVM developers' meetings and on the LLVM mailing lists).…”
Section: Recent Progress For Parallelism We Have Implemented Several (citation type: mentioning)
confidence: 99%