2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA)
DOI: 10.1109/isca.2018.00070
Gist: Efficient Data Encoding for Deep Neural Network Training

Cited by 114 publications (51 citation statements)
References 13 publications
“…The actual computation in the sparse GEMM (described below) still uses single precision (FP32) by converting FP16 data back to FP32. Therefore, the precision lost during conversion has little impact on the overall training performance [10], [33]. We use the second 16 bits to represent the row index.…”
Section: ELLPACK-DIB Based GEMM
confidence: 99%
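The packing this excerpt describes can be made concrete. The snippet below is a minimal sketch, assuming each stored nonzero is a 32-bit word whose low 16 bits hold the FP16 value bits and whose high 16 bits hold the row index; the function names and exact bit layout are hypothetical illustrations, not the citing paper's actual ELLPACK-DIB format.

import numpy as np

def pack_fp16_with_row_index(values_fp32, row_indices):
    # Hypothetical layout: low 16 bits = FP16 value bits, high 16 bits = row index.
    vals_fp16_bits = values_fp32.astype(np.float16).view(np.uint16).astype(np.uint32)
    return (row_indices.astype(np.uint32) << 16) | vals_fp16_bits

def unpack_for_fp32_compute(packed):
    # Recover the row indices and widen the FP16 payload back to FP32 for the GEMM.
    row_indices = (packed >> np.uint32(16)).astype(np.uint16)
    vals_fp32 = (packed & np.uint32(0xFFFF)).astype(np.uint16).view(np.float16).astype(np.float32)
    return vals_fp32, row_indices

vals = np.array([0.15625, -2.5, 3.1415], dtype=np.float32)
rows = np.array([7, 42, 1023], dtype=np.uint16)
recovered, idx = unpack_for_fp32_compute(pack_fp16_with_row_index(vals, rows))
# Any error in `recovered` comes only from the FP32 -> FP16 value conversion;
# the 16-bit row index survives the round trip exactly.

The round trip makes the quoted point explicit: only the value conversion to FP16 loses precision, while the index information is preserved bit-exactly.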
“…The computer architecture community has explored methods to improve execution efficiency and memory usage when processing DNNs. Of the many approaches pursued, leveraging weight/activation sparsity has attracted a lot of attention [8], [9], [10], [11], [12], [13]. Prior studies have explored exploiting sparsity to accelerate DNN computations during both training and inference on customized platforms (e.g., FPGAs and ASICs) [4], [8], [9], [12].…”
Section: Introduction
confidence: 99%
“…Therefore, it is essential to reduce the memory requirements to allow better network training and deployment, such as applying deep CNNs to embedded systems and cell phones. Several studies [4] show that the intermediate layer outputs (feature maps) are the primary contributors to this memory bottleneck. Existing methods, such as model compression [5, 6] and scheduling [7], do not directly address the storage of feature maps.…”
Section: Introduction
confidence: 99%
“…Neural networks are never satiated with their current speed [Lu and Liang, 2018; Yu et al., 2017; Zhao et al., 2017]. From the moment neural networks appeared, research on accelerating them also began [Posewsky and Ziener, 2018; Jain et al., 2018; Zhang et al., 2015]. The representative acceleration methods are mainly built on the FFT [Mathieu et al., 2013], which fully exploits component reuse in the frequency domain.…”
Section: Introduction
confidence: 99%
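The frequency-domain reuse mentioned in the last excerpt is the classic FFT-convolution identity: convolution in the spatial domain becomes a pointwise product after the transform. Below is a minimal sketch assuming 1-D circular convolution; fft_conv1d is a hypothetical helper, not an API from the cited works.

import numpy as np

def fft_conv1d(signal, kernel):
    # Circular convolution via the frequency domain: two forward FFTs,
    # one pointwise multiply, and one inverse FFT replace the sliding window.
    padded = np.zeros_like(signal)
    padded[:len(kernel)] = kernel
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(padded)))

x = np.random.rand(1024)
k = np.random.rand(16)
y = fft_conv1d(x, k)  # matches direct circular convolution up to floating-point error

Once an operand has been transformed, its spectrum can be reused across multiple pointwise products, which is one reading of the component reuse the excerpt refers to.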