2013
DOI: 10.1145/2450136.2450138

A Transformation Framework for Optimizing Task-Parallel Programs

Abstract: Task parallelism has increasingly become a trend with programming models such as OpenMP 3.0, Cilk, Java Concurrency, X10, Chapel and Habanero-Java (HJ) to address the requirements of multicore programmers. While task parallelism increases productivity by allowing the programmer to express multiple levels of parallelism, it can also lead to performance degradation due to increased overheads. In this article, we introduce a transformation framework for optimizing task-parallel programs with a focus on task creat…

Cited by 31 publications (19 citation statements)
References 45 publications
“…These HJ ports are not new to this paper; they have also been used in earlier performance evaluation, e.g. [3] and [26]. Also, the HJ versions of these benchmarks are fundamentally the same as the OpenMP versions; the primary change (in addition to translating C code to Java code) is that the OpenMP 3.0 task, taskwait and critical directives were replaced by async, finish and isolated statements in HJ, respectively.…”
Section: Methods
Citation type: mentioning
confidence: 99%
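The construct mapping described in the statement above (OpenMP `task`, `taskwait` and `critical` replaced by HJ `async`, `finish` and `isolated`) can be illustrated with a minimal sketch. It does not use HJ syntax or the benchmarks mentioned in the statement: it assumes plain `java.util.concurrent` stand-ins, and the class name `AsyncFinishSketch`, the method `parallelSum` and the pool size are hypothetical choices made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: standard-library stand-ins for the
// constructs named in the quoted statement. This is NOT HJ syntax.
final class AsyncFinishSketch {
    // Stand-in for HJ `isolated` / OpenMP `critical`: one global lock.
    private static final Object ISOLATED = new Object();
    private static long total;

    // Sums 1..n by spawning one task per chunk, then waiting for all tasks.
    static long parallelSum(int n, int chunks) throws Exception {
        total = 0;
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        List<Future<?>> spawned = new ArrayList<>();
        try {
            int size = (n + chunks - 1) / chunks; // ceiling division
            for (int c = 0; c < chunks; c++) {
                final int lo = c * size + 1;
                final int hi = Math.min(n, (c + 1) * size);
                // Plays the role of `async` / OpenMP `task`: spawn a child task.
                spawned.add(pool.submit(() -> {
                    long local = 0;
                    for (int i = lo; i <= hi; i++) {
                        local += i;
                    }
                    // Plays the role of `isolated` / `critical`: guarded update.
                    synchronized (ISOLATED) {
                        total += local;
                    }
                }));
            }
            // Plays the role of `finish` / `taskwait`: wait for spawned tasks.
            for (Future<?> f : spawned) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return total;
    }
}
```

Calling `AsyncFinishSketch.parallelSum(100, 8)` spawns eight tasks, each of which adds its partial sum to the shared total inside the guarded region. The call returns only after every spawned task has completed.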
“…Such clock objects are replaced with instantiations of specialized clock classes that take advantage of the above properties. Nandivada et al presented techniques to reduce the overheads of X10 clock (and HJ phaser) operations by chunking parallel loops with synchronization operations. Feautrier et al proposed a technique to transform code written using clocks‐async‐finish abstractions to code that does not use clocks.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Parallelization of place-change operations. In Figure 12 The dependencies among S1, S2, and E1 are computed using standard techniques [20]. Interestingly, say, "p" is the number of places and "k" is the size of Distribution D, then the number of remote-communications performed by the code compiled using the synchronization-elimination and place-level strip-mining techniques of Barik et al [6] are "2k" and "p + k," respectively.…”
Section: At-pruning: Reducing the Overheads Of Place Change Operations
Citation type: mentioning
confidence: 99%