Proceedings of the 9th International Workshop on Programming Models and Applications for Multicores and Manycores 2018
DOI: 10.1145/3178442.3178444

Combining PREM compilation and ILP scheduling for high-performance and predictable MPSoC execution

Cited by 9 publications (13 citation statements)
References 9 publications

“…Specifically, two memory phases are considered: an acquisition (or load) phase that copies data and instructions from main memory into local memory, and a replication (or unload) phase that copies modified data back to main memory. While the computation phase is always executed on a processor, the memory phases can be either executed on the processor itself [5,6,13,22,26,49,50,53,56,71,72], or on another hardware component [30,31], such as a programmable Direct Memory Access (DMA) module [7,20,61,66]. Works that proposed using a DMA unit to perform the memory transfers [66] can efficiently hide the memory latency by overlapping the execution of a task with the DMA transfer of another task; this leads to considerable improvements in schedulability.…”
Section: Software Solutions
confidence: 99%
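The excerpt above describes PREM's acquisition (load), computation, and replication (unload) phases, and notes that delegating the memory phases to a DMA engine lets one task's transfer overlap another task's computation. A minimal C sketch of that double-buffering idea follows; dma_start/dma_wait and the tile size are illustrative assumptions (implemented here as synchronous memcpy stubs), not an API from any of the cited works.

```c
#include <stddef.h>
#include <string.h>

#define TILE 256   /* illustrative tile size, in floats */

/* Hypothetical DMA helpers, modelled here as synchronous memcpy stubs.
 * On real hardware, dma_start would launch an asynchronous transfer and
 * dma_wait would block until it completes. */
static void dma_start(float *dst, const float *src, size_t n) {
    memcpy(dst, src, n * sizeof(float));
}
static void dma_wait(void) { /* nothing to do: the stub copies eagerly */ }

/* Double buffering: while the core computes on one local buffer,
 * the next tile is being loaded into the other buffer. */
void process(const float *in, float *out, size_t n_tiles) {
    static float buf[2][TILE];
    if (n_tiles == 0)
        return;

    dma_start(buf[0], in, TILE);                        /* load phase, tile 0 */
    for (size_t k = 0; k < n_tiles; k++) {
        dma_wait();                                     /* tile k is now local */
        if (k + 1 < n_tiles)                            /* prefetch tile k+1   */
            dma_start(buf[(k + 1) % 2], in + (k + 1) * TILE, TILE);

        float *cur = buf[k % 2];
        for (size_t i = 0; i < TILE; i++)               /* compute phase: local data only */
            cur[i] *= 2.0f;

        memcpy(out + k * TILE, cur, TILE * sizeof(float));  /* unload phase */
    }
}
```

With a genuinely asynchronous DMA, the prefetch of tile k+1 proceeds in the background while the core computes on tile k, which is the latency-hiding effect the excerpt credits with the schedulability improvements.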
“…4.2). Then, we briefly summarize the ILP model from our previous work [4] (Sec. 4.3) and finally introduce the new heuristic (Sec.…”
Section: Scheduling
confidence: 99%
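For context, an ILP model of this kind typically assigns start times to each task's load, compute, and unload phases while keeping memory phases mutually exclusive on the shared memory. The formulation below is a generic illustration of that structure, not the actual model from [4]; s and d denote phase start times and durations, E the task-graph edges, and M a large constant.

\[
\begin{aligned}
\min\;& C_{\max} \\
\text{s.t.}\;& s_i^{C} \ge s_i^{L} + d_i^{L}, \qquad s_i^{U} \ge s_i^{C} + d_i^{C} && \forall i \\
& s_j^{L} \ge s_i^{U} + d_i^{U} && \forall (i,j) \in E \\
& s_i^{L} + d_i^{L} \le s_j^{L} + M(1 - y_{ij}), \qquad s_j^{L} + d_j^{L} \le s_i^{L} + M y_{ij} && \forall i < j \\
& C_{\max} \ge s_i^{U} + d_i^{U}, \qquad y_{ij} \in \{0,1\} && \forall i
\end{aligned}
\]

The disjunctive pair of constraints (written here for load phases only; a full model would cover all memory-phase pairs and the assignment of compute phases to cores) serializes conflicting memory accesses, and the objective minimizes the makespan \(C_{\max}\).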
“…In our previous work [4] we have presented a prototype compiler - capable of transforming regular loops into PREM-compliant code - coupled to a scheduling tool based on an ILP model and capable of optimally scheduling small task graphs. In this paper we significantly extend our previous work along several axes.…”
Section: Introduction
confidence: 99%
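The transformation mentioned in this excerpt can be pictured as tiling a regular loop and wrapping each tile in explicit load/compute/unload phases. The sketch below is a hand-written illustration of that idea, assuming a software-managed local buffer; it is not output of the paper's compiler.

```c
#include <string.h>

#define TILE 256   /* illustrative tile size */

/* Original loop: each iteration may access main memory directly. */
void scale(const float *a, float *b, int n) {
    for (int i = 0; i < n; i++)
        b[i] = 2.0f * a[i];
}

/* PREM-style version (illustrative): the loop is tiled; each tile is copied
 * into local buffers (load phase), processed from local memory only
 * (compute phase), and written back (unload phase). */
void scale_prem(const float *a, float *b, int n) {
    float la[TILE], lb[TILE];                /* stand-ins for scratchpad memory */
    for (int t = 0; t < n; t += TILE) {
        int len = (n - t < TILE) ? (n - t) : TILE;

        memcpy(la, a + t, (size_t)len * sizeof(float));   /* load phase    */

        for (int i = 0; i < len; i++)                     /* compute phase */
            lb[i] = 2.0f * la[i];

        memcpy(b + t, lb, (size_t)len * sizeof(float));   /* unload phase  */
    }
}
```

Separating the code this way gives each tile well-defined memory and compute intervals, which a task-graph scheduler (ILP-based or heuristic) can then arrange so that memory phases of different cores do not collide.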
“…The authors of [11] proposed a technique for compiling a GPU kernel into PREM-compliant code. In [17], authors present a compiler based on the LLVM infrastructure that refactors legacy code into PREM code.…”
Section: Related Work
confidence: 99%
“…Another solution is to rely on a compiler that automates phase separation. For instance, PREM-compliant compilation for the LLVM framework has been proposed in [11,17]. Our approach also tackles PREM-compliant C code generation but starts from a higher level of abstraction than previous approaches.…”
Section: Introduction
confidence: 99%