2021
DOI: 10.1109/access.2021.3074171

A Highly-Efficient and Tightly-Connected Many-Core Overlay Architecture

Abstract: Advances in CPU (Central Processing Unit) architecture alternate between generalization and specialization. Over the past decade, general-purpose performance has improved while confronting new brick walls: power, memory, and ILP (Instruction-Level Parallelism). CPU architecture is therefore entering an era of specialization, called adaptable ISA (Instruction Set Architecture), aimed at target applications. Reconfigurable devices such as FPGAs (Field-Programmable Gate Arrays) can offer a solution if th…

Cited by 11 publications (3 citation statements)
References 14 publications
“…2GRVI Phalanx [59] extends that to more than 1000 64-bit RISC-V cores on a Xilinx VU37P. The DRAGON architecture [60] is a 64-bit custom-ISA cluster-based multiprocessor that scales to 144 cores on a Xilinx VU37P. In contrast, the accelerator in HEROv2 is not specialized for FPGAs but has identical RTL code as for ASIC tapeouts.…”
Section: Related Work (mentioning)
confidence: 99%
“…The survey done in [8] highlights the fact that many overlay tools have been developed in both classes; we can mention DeCO [9] for SC overlays; GRVI Phalanx [10], and reMORPH [11] for TM overlays. The work done in [12] highlights a list of some previous parallel processing overlays.…”
Section: Introduction (mentioning)
confidence: 99%
“…All of the tools presented in [12] integrate parallel computing models; however, we have not encountered an FPGA overlay tool that addresses the design of MPI parallel applications without the intervention of CPUs. MPI parallelization compared to OpenMP has advantages of no parallelization overhead, except for the explicit communications that have been added to the program once the MPI parallel program has been configured; moreover, all aspects of MPI programs are generally executed in parallel, unlike OpenMP [13].…”
Section: Introduction (mentioning)
confidence: 99%