2011 18th International Conference on High Performance Computing (HiPC 2011)
DOI: 10.1109/hipc.2011.6152715

Porting irregular reductions on heterogeneous CPU-GPU configurations

Cited by 18 publications (20 citation statements)
References 25 publications
“…The stencil computations in MPAS often have irregular reduction modes [21] that are not suitable to be parallelized with OpenMP. This is because the input and output vectors may belong to different types of mesh points, and the code may have been designed to mix two arbitrary types of unknowns.…”
Section: Regularity-aware Loop Refactoring
confidence: 99%
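For readers unfamiliar with the pattern, the following is a minimal C sketch of the irregular reduction mode this statement refers to. The mesh connectivity and array names (cellsOnEdge, edgeFlux, cellFlux) are hypothetical, not taken from MPAS or the cited papers; they only illustrate why a plain loop-level OpenMP parallelization is unsafe: the loop runs over edges but accumulates into cell-centered output through an indirection array, so different iterations may update the same cell.

#include <stddef.h>

/* Sketch of an edge-to-cell irregular reduction on an unstructured mesh.
 * cellsOnEdge maps each edge to the two cells it separates, so the output
 * index is only known at run time; two edges that share a cell update the
 * same cellFlux entry, which is why a naive "#pragma omp parallel for"
 * over edges would race on cellFlux. */
void edge_to_cell_reduction(size_t nEdges,
                            const int (*cellsOnEdge)[2], /* [nEdges][2] */
                            const double *edgeFlux,      /* [nEdges]    */
                            double *cellFlux)            /* [nCells]    */
{
    for (size_t e = 0; e < nEdges; ++e) {
        int c0 = cellsOnEdge[e][0];
        int c1 = cellsOnEdge[e][1];
        cellFlux[c0] += edgeFlux[e];  /* indirect, potentially conflicting */
        cellFlux[c1] -= edgeFlux[e];  /* updates into cell-centered data   */
    }
}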
“…Alongside this data-dependence-based approach, there has also been a large body of work exploring the mapping of reductions in a polyhedral setting [26,44]. The treatment of more general reduction operations has received less attention. Work has focused on exploitation rather than discovery [18][19][20], examining trade-offs in implementation [52] or exploitation of novel hardware [42,51]. Recent work [16] shows that more complex reductions can be detected, but this is tied to an ad hoc non-portable code generation phase.…”
Section: Related and Future Work
confidence: 99%
“…The appropriate utilization of hybrid systems, however, typically requires complex software instruments to deal with a number of peculiar aspects of the different processors available. This challenge has motivated a number of languages and runtime frameworks [15], [14], [16], [18], [19], [20], [21], [22], [23], [24], [25], [26], specialized libraries [4], and compiler techniques [27].…”
Section: Related Work
confidence: 99%
“…Efficient execution of applications on distributed CPU-GPU-equipped platforms has been an objective of several projects [23], [24], [25], [26], [22], [29], [30]. Ravi et al. [24], [26] propose techniques for automatic translation of generalized reductions to CPU-GPU environments via compile-time techniques, which are coupled with runtime support to coordinate execution.…”
Section: Related Work
confidence: 99%
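As a rough illustration of what "generalized reductions" means in this line of work, the sketch below shows the canonical loop structure such compile-time translators target: each input element is processed into a contribution that is folded into a reduction object by an associative, commutative update, so disjoint chunks of the input can be processed independently (for instance, one portion on the CPU and one on the GPU) and the partial reduction objects merged afterwards. The histogram example and every name here are illustrative assumptions, not the API of the cited systems.

#include <stddef.h>

#define NUM_BINS 64

/* A generalized reduction: updates into the reduction object are
 * associative and commutative, so chunks of the input can be reduced
 * independently and merged with combine(). */
typedef struct { long bins[NUM_BINS]; } ReductionObject;

static void local_reduce(ReductionObject *r,
                         const double *data, size_t begin, size_t end)
{
    for (size_t i = begin; i < end; ++i) {
        int key = (int)(data[i] * NUM_BINS) % NUM_BINS;  /* process(e) -> key     */
        if (key < 0) key += NUM_BINS;
        r->bins[key] += 1;                               /* reduce(robj, key, 1)  */
    }
}

static void combine(ReductionObject *out, const ReductionObject *in)
{
    for (int b = 0; b < NUM_BINS; ++b)
        out->bins[b] += in->bins[b];  /* merge partial reduction objects */
}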