2020 IEEE 31st International Conference on Application-Specific Systems, Architectures and Processors (ASAP)
DOI: 10.1109/asap49362.2020.00016

Array Aware Training/Pruning: Methods for Efficient Forward Propagation on Array-based Neural Network Accelerators

Abstract: Due to the increase in the use of large-sized Deep Neural Networks (DNNs) over the years, specialized hardware accelerators such as the Tensor Processing Unit and Eyeriss have been developed to accelerate the forward pass of the network. The essential component of these devices is an array processor composed of multiple individual compute units for efficiently executing Multiplication and Accumulation (MAC) operations. As the size of this array limits the amount of DNN processing of a single layer, the com…
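To make the array-size constraint concrete, the sketch below (not from the paper) shows a matrix multiply processed in fixed-size blocks, the way a layer larger than the MAC array must be executed over several passes. The 8x8 tile size and the NumPy implementation are illustrative assumptions.

```python
import numpy as np

def tiled_matmul(A, B, tile=8):
    # Compute A @ B in tile x tile blocks, mimicking how a fixed-size
    # MAC array forces a large layer to be split across multiple passes.
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n))
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                # One pass through the array: a block of MAC updates.
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

rng = np.random.default_rng(0)
A, B = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
assert np.allclose(tiled_matmul(A, B), A @ B)
```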

Cited by 5 publications (3 citation statements); references 16 publications.
“…One application of neural networks is the Deep Neural Network, or DNN. A Deep Neural Network is a neural network composed of more than one layer [11]. Research using a DNN was carried out by [12] to analyze sentiment in Indonesian-language Twitter posts about government institutions and government figures.…”
Section: Introduction
“…The result of this computation is then passed through an activation function, which forms the output of that layer. Equation 1 gives the computation performed at each layer [11].…”
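Equation 1 itself is not reproduced in this excerpt; the computation the statement describes is the standard fully connected layer, a weighted sum followed by an element-wise activation, i.e. y = f(Wx + b). A minimal sketch, assuming NumPy and a tanh activation (both illustrative choices):

```python
import numpy as np

def dense_forward(x, W, b, activation=np.tanh):
    # Weighted sum of the inputs (the MAC work the accelerator array
    # performs), then an element-wise activation; the result is the
    # layer's output and the next layer's input.
    z = W @ x + b
    return activation(z)

# Toy layer mapping 4 inputs to 3 outputs.
rng = np.random.default_rng(0)
x, W, b = rng.standard_normal(4), rng.standard_normal((3, 4)), np.zeros(3)
print(dense_forward(x, W, b))
```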
“…Neural Network model optimization techniques such as efficient network design, pruning, and quantization are essential for efficient real-time inference on hardware. Pruning [4,5,11,12] reduces the size and computational complexity of a network by removing redundant connections/neurons that do not contribute significantly to model accuracy. Weight/connection-wise pruning is irregular in nature and hence introduces non-uniform sparsity in the weight matrices.…”
Section: Introduction
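To illustrate why weight/connection-wise pruning yields non-uniform sparsity, here is a minimal sketch of unstructured magnitude pruning, assuming NumPy; the function name, threshold rule, and 50% sparsity target are illustrative, not the paper's array-aware method.

```python
import numpy as np

def magnitude_prune(W, sparsity=0.5):
    # Zero out the smallest-magnitude fraction of the weights.
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy()
    threshold = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return W * (np.abs(W) > threshold)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
Wp = magnitude_prune(W)
print(np.count_nonzero(W), "->", np.count_nonzero(Wp))
```

The surviving weights land wherever their magnitudes happen to be large, so a fixed-size MAC array cannot skip the zeros without extra indexing logic; this irregularity is what motivates structured, array-aware pruning.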