2014 47th Annual IEEE/ACM International Symposium on Microarchitecture 2014
DOI: 10.1109/micro.2014.14

Specializing Compiler Optimizations through Programmable Composition for Dense Matrix Computations

Abstract: General-purpose compilers aim to extract the best average performance for all possible user applications. Due to the lack of specializations for different types of computations, compiler-attained performance often lags behind that of manually optimized libraries. In this paper, we demonstrate a new approach, programmable composition, to enable the specialization of compiler optimizations without compromising their generality. Our approach uses a single pass of source-level analysis to recognize a common pa…

Cited by 8 publications (3 citation statements)
References 27 publications

“…Several approaches can fully automatically optimize MMA[×, +] and obtain high-performance code that outperforms expert-tuned implementations. The POET optimization library (Yi et al 2014) and AUGEM framework (Wang et al 2013) use annotations and templates of sequential code, respectively, written by domain experts to guide general-purpose compilers to produce optimized MMA[×, +] kernels from specifically prepared code. The Portable Compiler Approach (POCA) (Su et al 2017) generates an optimized micro-kernel based on LLVM IR representing MMA [×, +] and subsequent domain-specific but architecture-independent optimizations of its micro-kernel.…”
Section: Automatic Optimization of MMA[×, +] (mentioning)
confidence: 99%
“…We implemented our method in the OpenBLAS [3] library and evaluated it on Phytium 2000+, an emerging high-performance many-core processor based on Arm's AArch64 architecture. We restrict our evaluation to DGEMM, as in prior work [10][11][12], for two reasons. First, the basic idea of the hybrid-grained load-balancing method applies to other variants of GEMM such as SGEMM, CGEMM and ZGEMM.…”
Section: Results (mentioning)
confidence: 99%
“…ATLAS [1] adopts the auto-tuning method to automatically generate kernels with different parameters in C and find the best-performing one by running them on the actual computing system. POET [12,16,17] and AUGEM [11] use a directive-based programming approach. POCA [14] is a compiler-based approach which generates and optimizes kernels automatically and portably.…”
Section: Related Work (mentioning)
confidence: 99%