2021
DOI: 10.1007/978-3-030-85665-6_31
Exploiting Co-execution with OneAPI: Heterogeneity from a Modern Perspective

Cited by 11 publications
(4 citation statements)
References 21 publications
“…Coexecutor Runtime is presented, extending the work and preliminary results of the conference paper [29]. The key innovations are the high level API, increasing the abstraction but maintaining its compatibility and extensibility with SYCL; an efficient architectural design focused on preserving and reusing as many oneAPI primitives as possible while extending its functionality; and as far as we know, it is the first co-execution runtime for Intel oneAPI.…”
Section: Introduction (mentioning)
confidence: 84%
“…The experiments to validate the Coexecutor Runtime [29] (https://github.com/oneAPIscheduling/CoexecutorRuntime, accessed on 28 September 2021) were carried out in two nodes, labeled Desktop and DevCloud. Desktop was a computer with an Intel Core i5-7500 Kaby Lake architecture processor, with four cores at 3400 MHz, one thread per core and three cache levels.…”
Section: Methods (mentioning)
confidence: 99%
“…There are situations in which an algorithm performs better in one type of problem or device, and in other cases another one behaves much better. For example, an integrated GPU that supports compute-communication overlap via multiple queues, when faced with a program behavior like NBody, can benefit from algorithms that divide the load into many small packets [5,17], while a discrete accelerator faced with the execution of many short-lived kernels generally cannot amortize the management overhead, and is better suited to algorithms that exploit very large packets [18][19][20]. For this reason, it is necessary to provide an appropriate and more sophisticated load balancing algorithm that take into account the context of the simulation and the runtime system.…”
Section: Motivation (mentioning)
confidence: 99%
“…There have been different works related to combining heterogeneous programming models and technologies [1-6], but they usually provide explicit code inputs, isolation of technologies by tasks, focus only on CPU-GPU distribution or use non-OpenCL-based languages. Some works focus on providing load distribution for HPC simulation environments [1,7-13], but most focus on distributed technologies in combination with shared memory.…”
Section: Introduction (mentioning)
confidence: 99%