Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017
DOI: 10.23919/date.2017.7927224

Understanding the impact of precision quantization on the accuracy and energy of neural networks

Abstract: Recently, the posit numerical format has shown promise for DNN data representation and compute with ultra-low precision ([5..8]-bit). However, the majority of studies focus only on DNN inference. In this work, we propose DNN training using posits and compare it with floating-point training. We evaluate on both the MNIST and Fashion MNIST datasets, where 16-bit posits outperform 16-bit floating point for end-to-end DNN training.
Index Terms: Deep neural networks, low-precision arithmetic, posit numerical format
I. INTROD…
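To make the data representation concrete: a posit(n, es) value encodes a sign, a variable-length regime, up to es exponent bits, and a fraction. The sketch below simulates posit quantization by decoding every n-bit posit pattern into a value table and rounding inputs to the nearest entry. It is a minimal illustration, not the paper's training setup; the es = 1 default, the helper names, and the round-to-nearest-value tie handling (the posit standard specifies round-to-nearest-even) are assumptions.

```python
import numpy as np

def decode_posit(bits, n=16, es=1):
    """Decode an n-bit posit bit pattern (an unsigned int) to a float.
    Returns 0.0 for the zero pattern and NaN for the NaR pattern."""
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):                     # NaR ("not a real")
        return float("nan")
    sign = -1.0 if bits >> (n - 1) else 1.0
    if sign < 0:                                 # negative posits are stored in two's complement
        bits = (1 << n) - bits
    body = format(bits & ((1 << (n - 1)) - 1), f"0{n - 1}b")   # bits after the sign
    run = len(body) - len(body.lstrip(body[0]))  # regime = leading run of identical bits
    k = run - 1 if body[0] == "1" else -run
    rest = body[run + 1:]                        # skip the regime terminator bit
    exp_bits, frac_bits = rest[:es], rest[es:]
    e = int(exp_bits, 2) << (es - len(exp_bits)) if exp_bits else 0   # missing exponent bits are zero
    frac = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
    return sign * 2.0 ** (k * (1 << es) + e) * (1.0 + frac)

def quantize_to_posit(x, n=16, es=1):
    """Round each element of x to the nearest representable posit(n, es) value.
    Rebuilds the value table on every call; cache it in real use."""
    vals = np.array([decode_posit(p, n, es) for p in range(1 << n)])
    table = np.sort(vals[~np.isnan(vals)])
    x = np.asarray(x, dtype=np.float64)
    idx = np.clip(np.searchsorted(table, x), 1, len(table) - 1)
    lo, hi = table[idx - 1], table[idx]
    return np.where(np.abs(x - lo) <= np.abs(hi - x), lo, hi)

print(quantize_to_posit(np.array([0.1, -3.14159, 1e4])))
```

In a training simulation, a quantizer like this would be applied to weights and activations between otherwise full-precision operations, which is a common way to study low-precision formats without dedicated hardware.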

Cited by 92 publications (71 citation statements)
References 30 publications
“…8-bit) to high-precision floating point (e.g. 32-bit) [14]. However, these works compare numerical formats with disparate bit-widths and thereby do not fairly provide a comprehensive, holistic study of the network efficiency.…”
Section: Introduction (mentioning)
confidence: 99%
“…After decoding inputs, multiplication and converting to fixed-point is performed similarly to that of floating point. Products are accumulated in a register, or quire in the posit literature, of width $q_{size}$ as given by (4):

$q_{size} = 2^{es+2} \times (n - 2) + 2 + \lceil \log_2(k) \rceil, \quad n \geq 3 \qquad (4)$ …”
Section: Posit EMAC (mentioning)
confidence: 99%
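For a quick sanity check of Eq. (4), the helper below evaluates the quire width for a couple of (n, es, k) configurations; the function name and the example parameter values are illustrative assumptions, not figures from the cited work.

```python
import math

def quire_width(n, es, k):
    """Quire (accumulator) width from Eq. (4): wide enough to sum k exact
    products of n-bit posits with es exponent bits; valid for n >= 3."""
    assert n >= 3
    return 2 ** (es + 2) * (n - 2) + 2 + math.ceil(math.log2(k))

# e.g. accumulating k = 256 products of 8-bit (es = 0) or 16-bit (es = 1) posits
print(quire_width(8, 0, 256))    # 4*6 + 2 + 8 = 34 bits
print(quire_width(16, 1, 256))   # 8*14 + 2 + 8 = 122 bits
```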
“…8-bit) to a high-precision floating point (e.g. 32-bit) [4]. The utility of these studies is limited: the comparisons are across numerical formats with different bit widths and do not provide a fair understanding of the overall system efficiency.…”
Section: Introduction (mentioning)
confidence: 99%
“…The most popular approach is to introduce approximate computing techniques to CNNs and benefit from the fact that the applications utilizing CNNs are highly error resilient (i.e., a huge reduction in energy consumption can be obtained for an acceptable loss in accuracy) [7]. Approximate implementations of CNNs are based on various techniques such as innovative hardware architectures of CNN accelerators, simplified data representation, pruning of less significant neurons, approximate arithmetic operations, approximate memory access, weight compression and "in memory" computing [11,7,2]. For example, employing the FX operations has many advantages such as reduced (i) power consumption per arithmetic operation, (ii) memory capacity needed to store the weights and (iii) processor-memory data transfer time.…”
Section: Related Work (mentioning)
confidence: 99%
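As a rough illustration of the fixed-point (FX) representation the quoted work contrasts with floating point, the sketch below snaps a weight array to a signed fixed-point grid with a given number of fractional bits and saturates out-of-range values. It is a generic sketch under assumed parameters (8-bit word, 6 fractional bits), not the quantization scheme of any cited paper, and np.round's ties-to-even behavior is a simplification.

```python
import numpy as np

def quantize_fixed_point(w, total_bits=8, frac_bits=6):
    """Quantize w to signed fixed-point with `frac_bits` fractional bits
    (step 2**-frac_bits), saturating to the representable range."""
    step = 2.0 ** -frac_bits
    qmin = -(2 ** (total_bits - 1)) * step         # most negative code
    qmax = (2 ** (total_bits - 1) - 1) * step      # most positive code
    return np.clip(np.round(np.asarray(w) / step) * step, qmin, qmax)

w = np.array([0.731, -0.052, 1.9, -2.4])
print(quantize_fixed_point(w))   # snapped to multiples of 1/64, clipped to [-2, 1.984375]
```

Storing such codes as 8-bit integers (rather than 32-bit floats) is what yields the memory, bandwidth, and per-operation energy savings the quote lists.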
“…Our objective is to design and optimize not only with respect to the classification error, but also with respect to hardware resources needed when the final (trained) CNN is implemented in an embedded system with limited resources. As energy-efficient machine learning is a highly desired technology, various approximate implementations of CNNs have been introduced [7,2]. Contrasted to the existing neuroevolutionary approaches trying to minimize the classification error as much as possible and assuming that CNN is executed using floating point (FP) operations on a Graphical Processing Unit (GPU) [3,1], our target is a highly optimized CNN whose major parts are executed with reduced precision in fixed point (FX) arithmetic operations.…”
Section: Introduction (mentioning)
confidence: 99%