2019
DOI: 10.1007/978-3-030-30709-7_6

PRTSM: Hardware Data Arrangement Mechanisms for Convolutional Layer Computation on the Systolic Array

Abstract: The systolic array is an array of processing units which share the inner data flow. Since the 2D systolic array fits the operation of multiplication and accumulation (MAC) naturally, there are many groups which use the systolic array to accelerate the computation of DNN (Deep Neural Network). However, the performance of the systolic array is limited by the data bandwidth. Some groups solve this problem with the method of loop tiling and care little about the pixel reuse potential of the convolutional layer. In…

Citations: cited by 2 publications (2 citation statements)
References: 12 publications (17 reference statements)
“…As shown in Figure 4, by dividing the array into multiple small arrays into multiple Brams, multiple data can be accessed in one cycle. A [12] Local_a0 [3] Local_a1 [3] Local_a2 [3] Local_a3 [3]…”
Section: HLS Performance Optimization (mentioning)
confidence: 99%
“…In [12], the authors proposed a convolutional computation unit based on a systolic array. However, the implemented systolic array is relatively small, and the input feature map is also too small, only 7*7*3, which results in the underutilization of the board's resources.…”
Section: Hardware Resource Consumption (mentioning)
confidence: 99%