2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (2011)
DOI: 10.1109/ipdps.2011.281

DAGuE: A Generic Distributed DAG Engine for High Performance Computing

Abstract: The frenetic development of the current architectures places a strain on the current state-of-the-art programming environments. Harnessing the full potential of such architectures has been a tremendous task for the whole scientific computing community. We present DAGuE, a generic framework for architecture-aware scheduling and management of micro-tasks on distributed many-core heterogeneous architectures. Applications we consider can be represented as a Directed Acyclic Graph of tasks with labeled edges d…

Citations: cited by 157 publications (180 citation statements)
References: 20 publications (10 reference statements)
“…In terms of speedup values the above improvements imply an additional speedup improvement of 1.28 and 1.23 for our two different test cards respectively. These achievements are comparable to the ones presented in other recent works in the literature [14, 16, 17]. Furthermore, the value θ that leads to the larger improvement for 4 CPU-cores is 0.8 for the GTX 760 and 0.9 for the GTX TitanX GPU, whereas the corresponding values for 8 CPU-cores become 0.6 and 0.8 respectively.…”
Section: Performance of the Hybrid OpenMP+CUDA Scheme
Citation type: supporting
confidence: 87%
“…It also includes an automated distribution of the computational load on the CPU and the GPU, and achieves very good performance mainly in audio-processing systems. In [14, 16, 17] more recent, relevant approaches are presented in the field of linear algebra and systems. In [17] an additional speedup of up to 1.25 is achieved (compared to the GPU-only implementation) for the parallel execution of the conjugate gradient method, whereas in [14, 16] the CPU/GPU collaboration schemes in the field of linear algebra achieved a speedup ranging from 1.15 up to 1.35 for different sizes and types of problems.…”
Section: Related Work
Citation type: mentioning
confidence: 99%