2020
DOI: 10.3390/s20195558

An Accelerator Design Using a MTCA Decomposition Algorithm for CNNs

Abstract: Due to the high throughput and high computing capability of convolutional neural networks (CNNs), researchers are paying increasing attention to the design of CNN hardware accelerator architectures. Accordingly, in this paper, we propose a block parallel computing algorithm based on the matrix transformation computing algorithm (MTCA) to realize the convolution expansion and resolve the block problem of the intermediate matrix. It enables highly parallel implementation on hardware. Moreover, we also provide a sp…
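The abstract describes lowering convolution into a matrix computation and then blocking the intermediate matrix for parallel hardware. As a rough, illustrative sketch of that general idea only (not the paper's actual MTCA decomposition), the snippet below shows the common im2col-style lowering in which a convolution becomes a single matrix product; the function name, shapes, and stride/padding choices are assumptions made for the example.

```python
import numpy as np

def im2col(x, k):
    """Unfold a single-channel input into a matrix so that a k x k convolution
    (stride 1, no padding) becomes one matrix multiplication."""
    h, w = x.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((out_h * out_w, k * k))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + k, j:j + k].ravel()
    return cols, (out_h, out_w)

# Example: convolving a 6x6 input with a 3x3 kernel as a matrix product.
x = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3))
cols, (oh, ow) = im2col(x, 3)
y = (cols @ kernel.ravel()).reshape(oh, ow)  # same result as direct convolution
```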

Cited by 6 publications (4 citation statements)
References 23 publications
“…In the experiments, we assume the clock frequency is 1 GHz. In addition, we assume that the CNN accelerator is in the SIMD architecture [9–14,16–18,23,26,27]. In Cnvlutin [23], Cambricon-X [9] and Dual Indexing [26], the number of PEs is 16.…”
Section: Experiments Results (mentioning)
confidence: 99%
“…To exploit the parallelism in CNNs, many CNN accelerators [9–14,16–18,23,26,27] are designed based on the single-instruction-multiple-data (SIMD) architecture. Note that the core of convolution operation is multiplication and accumulation.…”
Section: Related Work (mentioning)
confidence: 99%
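The related-work excerpt above notes that the core of convolution is multiplication and accumulation (MAC), which SIMD-style accelerators replicate across processing elements. A minimal sketch of that MAC inner loop follows; it is illustrative only, and the function and variable names are assumptions rather than anything taken from the cited papers.

```python
import numpy as np

def conv_mac(x, w):
    """Direct 2D convolution written as explicit multiply-accumulate (MAC) loops,
    the operation that SIMD-based CNN accelerators distribute across PEs."""
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            for p in range(k):
                for q in range(k):
                    acc += x[i + p, j + q] * w[p, q]  # one MAC per weight
            out[i, j] = acc
    return out

# Usage: an 8x8 input and a 3x3 kernel give a 6x6 output, one MAC chain per output pixel.
y = conv_mac(np.random.rand(8, 8), np.random.rand(3, 3))
```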
“…Since AlexNet achieved outstanding achievements in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC), a lot of research teams have been devoted to the development of convolutional neural networks (CNNs) with well-known research advances such as ZFNet, GoogleNet, VGG, ResNet, etc. Owing to the increasing demand for real-time applications, an efficient dedicated hardware computation unit (i.e., a CNN accelerator) is required to support the calculations [1,2,3,4,5,6] in the inference process. Moreover, for edge devices, low power is also an important concern [7,8,9].…”
Section: Introduction (mentioning)
confidence: 99%