Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems 2015
DOI: 10.1145/2694344.2694350

Synchronization Using Remote-Scope Promotion

Abstract: Heterogeneous system architecture (HSA) and OpenCL™ define scoped synchronization to facilitate low overhead communication across a subset of threads. Scoped synchronization works well for static sharing patterns, where consumer threads are known a priori. It works poorly for dynamic sharing patterns (e.g., work stealing) where programmers cannot use a faster small scope due to the rare possibility that the work is stolen by a thread in a distant slower scope. This puts programmers in a conundrum: optimize the…

Cited by 29 publications (21 citation statements)
References 23 publications
“…Other works [3,33,46] focus on reducing on-chip intra-GPU communication and coherence traffic. These works rely heavily on software assistance to reduce coherence complexity, which requires considerable programmer and compiler involvement (e.g., custom APIs, custom programming models), preventing their applicability to all types of applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…Happens-before is partitioned into global and local versions: global happens-before (ghb) contains global synchronises-with and sequenced-before edges between events on global memory, and local happens-before (lhb) is analogous. See Example 5 for a discussion of the repercussions of this definition of happens-before. Visibility is also split into global (gvis) and local (lvis) versions.…”
Section: OpenCL Axioms (mentioning)
confidence: 99%
“…The only published compilation scheme of the OpenCL 2.0 memory model of which we are aware is that published by AMD [30] and later formalised by Wickerson et al. [41]. The scheme compiles the release/acquire fragment of OpenCL atomics, and its soundness has been verified against an operational model of an AMD GPU [41].…”
Section: Implementability of the New SC Axiom (mentioning)
confidence: 99%