2013
DOI: 10.1007/978-3-642-36036-7_17

Scheduling Support for Communicating Parallel Tasks

Cited by 2 publications (4 citation statements)
References 19 publications
“…Applications have been coded and compiled within the ROCm-3.5.0 framework and llvm 12 compiler suite. The CPU code has been compiled combining a C++ NPB-MZ implementation (Dümmler and Rünger 2013) and the original NPB-MZ Fortran implementation (der Wijngaart and Jin 2003) to generate a version compatible with the ROCm implementation of the applications. All experiments have been performed in a system composed of AMD EPYC 7742 @ 2.250 GHz (64 cores and 2 threads/core, totalling 128 threads per node) and 2 × GPU AMD Radeon Instinct MI50 with 32 GB.…”
Section: Discussion (mentioning)
confidence: 99%
“…Dümmler and Rünger (2013) evaluated NPB-MZ benchmarks on hybrid CPU + GPU architectures. They decompose the workloads and, using a static scheduling, distribute them among the CPU’s or the GPU.…”
Section: Related Work (mentioning)
confidence: 99%
“…All applications have been coded combining OpenMP 5.2 and the ROCM‐3.5.0 framework and compiled with llvm 12. The CPU code has been implemented combining a C++ NPB‐MZ implementation [8] and the original NPB‐MZ Fortran implementation [5] to generate a version compatible with the ROCM implementation of the applications. The input mesh sizes correspond to class D using a total 13GB of memory and 1024 zones for SP‐MZ and BT‐MZ, and 16 zones for LU‐MZ.…”
Section: Discussion (mentioning)
confidence: 99%
“…NPB‐MZ studies: Dümmler and Rünger [8] evaluated NPB‐MZ benchmarks on hybrid CPU+GPU architectures. Workloads are decomposed and, using a static scheduling, distributed among CPUs and GPUs.…”
Section: Related Work (mentioning)
confidence: 99%