2006
DOI: 10.1145/1133255.1133997

Auto-vectorization of interleaved data for SIMD

Abstract: Most implementations of the Single Instruction Multiple Data (SIMD) model available today require that data elements be packed in vector registers. Operations on disjoint vector elements are not supported directly and require explicit data reorganization manipulations. Computations on non-contiguous and especially interleaved data appear in important applications, which can greatly benefit from SIMD instructions once the data is reorganized properly. Vectorizing such computations efficiently is therefore an am…

Citations: cited by 75 publications (71 citation statements)
References: 23 publications
“…Using the simple canonical approach from Section 2.2, we might generate loads and stores of the same data more than once. Similar to Nuzman et al [2006], we exploit this spatial locality by allowing multiple accesses to share mapped register sets when interleaving/deinterleaving, reducing the number of memory operations in the vectorized loop.…”
Section: Exploiting Spatial Locality: Grouping Multiple Interleaved A… (mentioning)
confidence: 99%
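The load sharing described in the statement above can be illustrated with a small model. This is a minimal sketch, not the cited authors' implementation: vector registers are modelled as Python lists, and `VF`, `Memory`, `vload`, `extract_even`, `extract_odd`, and both `sum_pairs_*` functions are hypothetical names introduced only for this illustration. It assumes the trip count `n` is a multiple of `VF`.

```python
VF = 4  # assumed vector length (elements per register), for illustration only


class Memory:
    """Hypothetical array wrapper that counts vector-wide loads."""

    def __init__(self, data):
        self.data = list(data)
        self.loads = 0

    def vload(self, start):
        """Load VF contiguous elements and record the memory operation."""
        self.loads += 1
        return self.data[start:start + VF]


def extract_even(v0, v1):
    """Keep the even-indexed elements of the concatenation of two vectors."""
    return (v0 + v1)[0::2]


def extract_odd(v0, v1):
    """Keep the odd-indexed elements of the concatenation of two vectors."""
    return (v0 + v1)[1::2]


def sum_pairs_naive(mem, n):
    """out[i] = a[2i] + a[2i+1]; each access loads its own pair of vectors."""
    out = []
    for i in range(0, n, VF):
        base = 2 * i
        evens = extract_even(mem.vload(base), mem.vload(base + VF))
        odds = extract_odd(mem.vload(base), mem.vload(base + VF))
        out.extend(e + o for e, o in zip(evens, odds))
    return out


def sum_pairs_shared(mem, n):
    """Same computation; both accesses reuse one pair of loaded vectors."""
    out = []
    for i in range(0, n, VF):
        base = 2 * i
        v0, v1 = mem.vload(base), mem.vload(base + VF)
        evens = extract_even(v0, v1)
        odds = extract_odd(v0, v1)
        out.extend(e + o for e, o in zip(evens, odds))
    return out
```

In this model, the two functions return the same result, but the shared version issues half as many vector loads because the accesses `a[2i]` and `a[2i+1]` read the same two registers.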
“…Investigations of bottlenecks in SIMD programs have identified non-unit-stride memory access patterns as a particular concern [Talla et al 2003; Maleki et al 2011; Schaub et al 2015]. Nuzman et al [2006] proposed an auto-vectorization algorithm for interleaved data access patterns where the stride is a power-of-two. Given a loop with such an access pattern, the algorithm generates extremely efficient vectorized code by directly exploiting the structure of the access pattern.…”
Section: Introduction (mentioning)
confidence: 99%
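To make the power-of-two structure concrete, the sketch below shows one way a strided stream can be pulled out of contiguously loaded vectors using log2(stride) rounds of even/odd extraction. It is an illustrative assumption, not necessarily the scheme of the cited paper: registers are modelled as Python lists, and `extract_even`, `extract_odd`, and `deinterleave` are hypothetical helper names.

```python
def extract_even(v0, v1):
    """Keep the even-indexed elements of the concatenation of two vectors."""
    return (v0 + v1)[0::2]


def extract_odd(v0, v1):
    """Keep the odd-indexed elements of the concatenation of two vectors."""
    return (v0 + v1)[1::2]


def deinterleave(vectors, offset):
    """Return the elements at positions offset, offset + s, offset + 2s, ...

    `vectors` holds s equally sized, contiguously loaded vectors, where the
    stride s = len(vectors) is assumed to be a power of two and
    0 <= offset < s.
    """
    stride = len(vectors)
    if stride == 0 or stride & (stride - 1):
        raise ValueError("stride must be a power of two")
    if not 0 <= offset < stride:
        raise ValueError("offset must lie in [0, stride)")
    # Each round halves the number of vectors; the current low bit of the
    # offset selects whether the odd or even elements are kept.
    while len(vectors) > 1:
        pick = extract_odd if offset & 1 else extract_even
        vectors = [pick(vectors[m], vectors[m + 1])
                   for m in range(0, len(vectors), 2)]
        offset >>= 1
    return vectors[0]
```

For example, with four 4-element vectors holding the values 0 to 15, `deinterleave(vectors, 3)` selects the elements at positions 3, 7, 11, and 15 in two rounds. A non-power-of-two stride would not halve evenly, which is why this sketch rejects it.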
“…There has been significant recent work in generating effective code for SIMD vector instruction sets in the presence of hardware alignment and stride constraints as described in [12,44,45,31,13]. The difficulties of optimizing for a wide range of SIMD vector architectures are discussed in [29,14].…”
Section: Related Work (mentioning)
confidence: 99%
“…The difficulties of optimizing for a wide range of SIMD vector architectures are discussed in [29,14]. In addition, several other works have addressed the exploitation of SIMD instruction sets [22,24,23,30,32,31,28]. All of these works only address SIMD hardware alignment issues.…”
Section: Related Work (mentioning)
confidence: 99%
“…Various impactful techniques have been applied to automatically generate SIMD code and to address the difficulties during vectorizing such as data permutations [8], interleaved data [9], etc. However, the optimizing approaches employed by those compilers still cannot drastically eliminate the irregular and non-aligned obstacles.…”
Section: Introduction (mentioning)
confidence: 99%