2021
DOI: 10.1109/tii.2020.3015833

Comparative Analysis of Processor-FPGA Communication Performance in Low-Cost FPSoCs

Cited by 13 publications
(7 citation statements)
References 28 publications
“…A read DMAC for the IA (RI-DMAC), a read DMAC for filter weights (RW-DMAC) and a write DMAC for the OA (WO-DMAC) are used. In addition, the processor core is responsible for synchronizing the DMACs in the hardware accelerator (i.e., for the starting and stopping of the DMAC execution [51][52][53]). Finally, the processor core makes it possible to reconfigure the hardware accelerator according to the number of DMACs.…”
Section: System Under Consideration
mentioning
confidence: 99%
“…It is assumed that the whole computation-intensive task (e.g., the convolutional layer in the case of CNN) is offloaded onto the accelerator and that the processor core is responsible for synchronizing each of the DMACs, as assumed in many other studies, e.g., [30][31][32]. More specifically, the processor can start and stop the DMAC execution appropriately by reading from or writing to the control registers of the accelerator, for example, through the AMBA AXI interface.…”
Section: System Under Consideration
mentioning
confidence: 99%
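The start/stop synchronization described in this statement can be illustrated with a minimal sketch. It simulates the register block in plain Python rather than touching real hardware. The register offsets, bit masks, and the `SimulatedDmac` class are all hypothetical and chosen for illustration only; none of them come from the cited papers or from any specific AXI peripheral.

```python
# Hypothetical register map; real offsets and bit layouts depend on the accelerator.
CTRL_REG = 0x00    # control register (assumed)
STATUS_REG = 0x04  # status register (assumed)
ADDR_REG = 0x08    # source/destination address register (assumed)

CTRL_START = 0x1   # assumed "start" bit
CTRL_STOP = 0x2    # assumed "stop" bit
STATUS_BUSY = 0x1  # assumed "busy" flag


class SimulatedDmac:
    """Software stand-in for one memory-mapped DMAC register block."""

    def __init__(self, name):
        self.name = name
        self.regs = {CTRL_REG: 0, STATUS_REG: 0, ADDR_REG: 0}

    def write(self, offset, value):
        # Store the value, then model the side effect of the control bits.
        self.regs[offset] = value
        if offset == CTRL_REG:
            if value & CTRL_START:
                self.regs[STATUS_REG] |= STATUS_BUSY
            if value & CTRL_STOP:
                self.regs[STATUS_REG] &= ~STATUS_BUSY

    def read(self, offset):
        return self.regs[offset]


def start_all(dmacs, addresses):
    """Processor-side routine: set each DMAC's address, then start it."""
    for dmac, addr in zip(dmacs, addresses):
        dmac.write(ADDR_REG, addr)
        dmac.write(CTRL_REG, CTRL_START)


def stop_all(dmacs):
    """Processor-side routine: stop every DMAC."""
    for dmac in dmacs:
        dmac.write(CTRL_REG, CTRL_STOP)


def is_busy(dmac):
    """Poll the status register, as a processor would over the bus."""
    return bool(dmac.read(STATUS_REG) & STATUS_BUSY)
```

On a real FPSoC the `read` and `write` calls would be replaced by accesses to the accelerator's memory-mapped registers; the simulated class only mirrors the control flow the statement describes.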
“…If the communication time is longer than the computation time (as assumed in the figure), it is referred to as communication-limited. In the communication-limited case, the performance of a DMA-controlled accelerator tends to be determined by the communication bandwidth, which is in turn determined by DRAM latency and bus protocol overhead. It is assumed that the whole computation-intensive task (e.g., the convolutional layer in the case of CNN) is offloaded onto the accelerator and that the processor core is responsible for synchronizing each of the DMACs, as assumed in many other studies, e.g., [30][31][32]. More specifically, the processor can start and stop the DMAC execution appropriately by reading from or writing to the control registers of the accelerator, for example, through the AMBA AXI interface.…”
mentioning
confidence: 99%
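The communication-limited condition quoted above reduces to a comparison of two times. The sketch below assumes a simple model in which communication time is the transferred byte count divided by an effective bandwidth; that model, the function names, and the example figures are illustrative assumptions, not values from the cited work.

```python
def transfer_time(num_bytes, bandwidth_bytes_per_s):
    """Estimated communication time in seconds under a flat-bandwidth assumption."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return num_bytes / bandwidth_bytes_per_s


def is_communication_limited(comm_time_s, comp_time_s):
    """True when communication takes longer than computation."""
    return comm_time_s > comp_time_s


# Hypothetical figures: 1 MB moved at an effective 500 MB/s, 1 ms of compute.
comm = transfer_time(1_000_000, 500_000_000)
limited = is_communication_limited(comm, 0.001)
```

In practice the effective bandwidth would itself have to be measured, since the statement attributes it to DRAM latency and bus protocol overhead rather than to a nominal bus rate.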
“…In addition, the processor core is responsible for synchronizing the DMACs in the hardware accelerator, i.e., starting and stopping the DMAC execution appropriately [24][25][26]. Recalling that the processor core is also responsible for setting the source/destination addresses of each DMAC, the bank allocations can also be reconfigured by letting the processor core allocate the DMAC a set of banks.…”
Section: On-chip Off-chip
mentioning
confidence: 99%