2015
DOI: 10.1016/j.parco.2015.03.001

Design and analysis of scheduling strategies for multi-CPU and multi-GPU architectures

Abstract: In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of XKaapi runtime: work stealing, data-aware work stealing, locality-aware work stealing, and Heterogeneous Earliest-Finish-Time (HEFT). On a heterogeneous architecture with 12 CPUs and 8 GPUs, we analysed our scheduling strategies with four benchmarks: a BLAS-1 AXPY vector operation, a Jacobi 2D iterative comp…

Cited by 19 publications (14 citation statements)
References 27 publications
“…It therefore suggests to constrain the dsyrk and dtrsm tasks to run exclusively on GPUs. Performance gains when constraining some tasks to GPUs were already reported by Lima et al. [51]. However, their results were achieved using scheduler hints provided by programmer annotations. In our case, the suggestion of when and which tasks to constrain to GPUs is inferred from the solution of the ABE without relying on programmer's knowledge about task's architecture affinity.…”
Section: Initial Motivation
Citation type: mentioning
confidence: 93%
“…In this paper, we address an intermediate setting, where tasks are independent, but share input data, and we analyze both makespan and communication performance. More recently, a study comparing different schedulers has been carried out in the context of dense linear algebra factorizations on heterogeneous systems [22]. Although this study is closely related to the work we present in this paper, it tackles neither the matrix product nor the static (resp.…”
Section: Related Work
Citation type: mentioning
confidence: 97%
“…We implemented extensions in the OpenMP runtime developed in our team, LIBKOMP [5,3], which is based on the XKAAPI [1,9] runtime system. XKAAPI is a task-based runtime system, using work stealing as a general scheduling strategy.…”
Section: Extension Of the Task Scheduler To Support Affinity
Citation type: mentioning
confidence: 99%
“…The way XKAAPI enables ready tasks and steals them. The scheduling framework in XKAAPI [1,9] relies on virtual functions for selecting a victim and selecting a place to push a ready task. When a processor becomes idle, the runtime system calls a function to browse the topology to find a locality domain, and steal a task from its task queue.…”
Section: Extension Of the Task Scheduler To Support Affinity
Citation type: mentioning
confidence: 99%
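
The last citation statement describes a scheduler built around two overridable hooks: one that selects a victim and one that selects where a ready task is pushed. A minimal Python sketch of that shape is given below. It is not XKAAPI's API: `LocalityDomain`, `SchedulingPolicy`, `NearestFirstPolicy`, `push_ready`, `on_idle`, the dictionary-based tasks, and the `affinity` key are all hypothetical names chosen for illustration, and the ring-order search is only one assumed way to "browse the topology".

```python
from abc import ABC, abstractmethod
from collections import deque


class LocalityDomain:
    """Hypothetical model: a group of workers sharing one task queue."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()


class SchedulingPolicy(ABC):
    """Two overridable hooks, standing in for the virtual functions."""

    @abstractmethod
    def select_victim(self, thief, domains):
        """Return the domain to steal from, or None if all are empty."""

    @abstractmethod
    def select_push_place(self, task, domains):
        """Return the domain whose queue receives a ready task."""


class NearestFirstPolicy(SchedulingPolicy):
    """Assumed policy: search from the thief's own domain outward."""

    def select_victim(self, thief, domains):
        start = domains.index(thief)
        # Visit the thief's own domain first, then the rest in ring order.
        for offset in range(len(domains)):
            candidate = domains[(start + offset) % len(domains)]
            if candidate.queue:
                return candidate
        return None

    def select_push_place(self, task, domains):
        # Honour the task's affinity hint if present, else use the first domain.
        for domain in domains:
            if domain.name == task.get("affinity"):
                return domain
        return domains[0]


def push_ready(policy, task, domains):
    """Enqueue a ready task in the place chosen by the policy."""
    policy.select_push_place(task, domains).queue.append(task)


def on_idle(policy, thief, domains):
    """Called when a worker in `thief` has no work; returns a task or None."""
    victim = policy.select_victim(thief, domains)
    if victim is None:
        return None
    # Take the oldest queued task from the chosen domain.
    return victim.queue.popleft()
```

Swapping in a different `SchedulingPolicy` subclass changes victim selection and task placement without touching `push_ready` or `on_idle`, which is the design point the quoted passage attributes to the use of virtual functions.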