Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, 2012
DOI: 10.1145/2150976.2151014

SIMD defragmenter

Abstract: Single-instruction multiple-data (SIMD) accelerators provide an energy-efficient platform to scale the performance of mobile systems while still retaining post-programmability. The central challenge is translating the parallel resources of the SIMD hardware into real application performance. In scientific applications, automatic vectorization techniques have proven quite effective at extracting large levels of data-level parallelism (DLP). However, vectorization is often much less effective for media applicati…

Cited by 22 publications (5 citation statements)
References 26 publications

“…SIMD extensions have been widely used in desktop for multimedia applications [1]. SIMD extensions offer high performance, high power consumption, and portability and are also suitable for mobile systems [2][3][4].…”
Section: Introduction (mentioning)
confidence: 99%
“…VeGen [21] implemented a compilation framework that uses non-SIMD instructions to realize automatic vectorization of nonisomorphic statements. Methods based on hardware special instructions are generally limited by the processor platform and introduce additional operating costs. (2) The nonisomorphic statement vectorization method based on expression equivalence transformation mainly uses expression equivalence transformation to convert nonisomorphic statements that satisfy certain conditions into isomorphic statements, thereby creating conditions for the implementation of SLP. For example, the LSLP method [19] analyzes and processes multiple nonisomorphic statements with differences in the order of operations and rearranges the commutative operations and operands based on the commutative law when the conditions are suitable to obtain isomorphic statements.…”
Section: Introduction (mentioning)
confidence: 99%
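Note: the commutative reordering described in the statement above can be illustrated with a toy sketch. This is not the LSLP algorithm from [19]; the statement tuples, the `operand_kind` classification, and the function names are all hypothetical and exist only to show how swapping the operands of a commutative operator can turn two nonisomorphic statements into isomorphic ones.

```python
# Toy illustration only; NOT the LSLP implementation.
# Hypothetical mini-IR: each statement is (dest, op, left, right).
COMMUTATIVE = {"+", "*"}


def operand_kind(operand):
    """Classify an operand as 'load' (array element such as a[0]) or 'scalar'."""
    return "load" if "[" in operand else "scalar"


def shape(stmt):
    """Shape used for the isomorphism check: operator plus operand kinds."""
    _, op, left, right = stmt
    return (op, operand_kind(left), operand_kind(right))


def make_isomorphic(stmts):
    """Swap operands of commutative ops so each statement matches the first one's shape.

    Returns the rewritten statements, or None if no legal swap yields a match.
    """
    target = shape(stmts[0])
    result = [stmts[0]]
    for dest, op, left, right in stmts[1:]:
        if shape((dest, op, left, right)) == target:
            result.append((dest, op, left, right))
        elif op in COMMUTATIVE and shape((dest, op, right, left)) == target:
            # The commutative law allows the swap without changing the result.
            result.append((dest, op, right, left))
        else:
            return None
    return result


stmts = [
    ("x0", "+", "a[0]", "k"),
    ("x1", "+", "k", "a[1]"),  # same computation, operands in the other order
]
packed = make_isomorphic(stmts)
```

After the swap, both statements have the shape (add, load, scalar), which is the precondition for packing them into one vector operation. With a non-commutative operator such as `-`, the sketch returns `None` because no legal reordering exists.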
“…To accelerate applications efficiently using vector units, a compiler or programmer should find a substantial amount of underlying data parallelism and translate the parallelization potential into a real code to make sufficient use of the vector unit. Although many techniques for improving the quality of the vector code have been proposed [4][5][6][7], the resulting vector resource utilization is still low. Manual vector code optimization is a basic approach; however, it requires a deep understanding of the target vector architectures, and the optimized codes have limited reusability.…”
Section: Introductionmentioning
confidence: 99%
“…Automatic compiler-level vectorization is a promising alternative to manual vector code generation, but it cannot provide sufficient coverage because it can vectorize only 45-71% of loops, even in synthetic benchmarks [8]. Moreover, many vectorized applications do not show sufficient performance gains, as expected, owing to the high data alignment overhead [4,7]. Although many vectorization libraries also utilize vector units by providing more general interfaces, they are still limited in use.…”
Section: Introductionmentioning
confidence: 99%
“…However, vectorization is often much less effective for applications which have low trip count loops, complex control flow, and non-uniform execution behavior [7]. As a result, SIMD lanes remain idle due to insufficient DLP [8]. SIMD widths have been following an upward trend: the 128-bit Streaming SIMD Extensions (SSE) of x86 architectures has been augmented by 256-bit Advanced Vector Extensions (AVX); the new Intel Many Integrated Core (MIC) architecture supports 512-bit SIMD.…”
Section: Introductionmentioning
confidence: 99%
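Note: the interaction between low trip counts and growing SIMD widths in the statement above can be made concrete with a rough calculation. The sketch below assumes 32-bit elements and a loop that always executes whole vector iterations (the last one partially filled); the function name and these assumptions are illustrative and do not come from the cited papers.

```python
import math


def lane_utilization(trip_count, simd_bits, elem_bits=32):
    """Fraction of SIMD lanes doing useful work for a loop of `trip_count` iterations.

    Assumes the loop runs ceil(trip_count / lanes) full-width vector iterations,
    so the lanes left over in the final iteration are counted as idle.
    """
    lanes = simd_bits // elem_bits
    vector_iters = math.ceil(trip_count / lanes)
    return trip_count / (vector_iters * lanes)


# A loop with only 6 iterations on the three widths named in the statement.
utilization = {bits: lane_utilization(6, bits) for bits in (128, 256, 512)}
```

Under these assumptions the same short loop fills a smaller share of the lanes as the vector width grows, which is one way to read the claim that lanes remain idle when DLP is insufficient.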