2019 International Conference on Field-Programmable Technology (ICFPT)
DOI: 10.1109/icfpt47387.2019.00009

Training Deep Neural Networks in Low-Precision with High Accuracy Using FPGAs

Cited by 23 publications (26 citation statements) · References 10 publications
“…Besides, even with cloud-level resources, reduced-precision and pruning approaches have also been utilized to decrease computation intensity and the communication bottleneck. Although the quantization adopted in prior training accelerators [4,18] led to remarkable benefits in terms of resource usage and power consumption, these works have not provided any evidence that such quantization techniques can maintain high accuracy on a large dataset (e.g., ImageNet) with dense neural networks.…”
Section: Related Work
confidence: 99%
“…), the on-chip memory of an edge FPGA is not big enough to hold the weights or features of every Conv layer. Therefore, several works [4,18,20] applied quantization or pruning to reduce off-chip memory access. However, unlike inference, where compressed networks cause little accuracy loss [7], these training works have not shown that their compression techniques can maintain high accuracy on large datasets with dense networks.…”
Section: And
confidence: 99%
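The compression techniques this excerpt refers to, quantization and pruning, can be illustrated with a minimal NumPy sketch (my own illustration, not taken from the accelerators in [4,18,20]): magnitude pruning zeroes out small weights, and symmetric 8-bit quantization shrinks the survivors, reducing the off-chip traffic a Conv layer's weights would otherwise require. The tensor shape, pruning ratio, and sparse storage format are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)   # hypothetical Conv-layer weights

# Magnitude pruning: zero out the 80% smallest-magnitude weights (assumed ratio).
threshold = np.quantile(np.abs(w), 0.8)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0).astype(np.float32)

# Symmetric 8-bit quantization of the surviving weights.
scale = np.abs(w_pruned).max() / 127.0
w_int8 = np.clip(np.round(w_pruned / scale), -127, 127).astype(np.int8)

dense_bytes = w.nbytes                                   # fp32, uncompressed
int8_bytes = w_int8.nbytes                               # quantized only
sparse_bytes = int(np.count_nonzero(w_int8)) * (1 + 2)   # int8 value + assumed 16-bit index
print(f"fp32: {dense_bytes} B, int8: {int8_bytes} B, pruned+int8: {sparse_bytes} B")
```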
“…This is not critical anyway in most machine learning applications (e.g., ANN), where a relatively reduced set of output categories or classes must be discriminated based on generic similarities. Indeed, most of the current machine learning applications use a small number of bits to represent digitized signals [41] because the difference in the final result between using high-precision floating-point signals and low-precision 8/16-bit signals is negligible [42]. Hence, the integration time used to evaluate the result of stochastic operations can be considerably reduced since it is exponentially dependent on the bit precision.…”
Section: Artificial Neural Network Applied To Virtual Screening
confidence: 99%
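The negligible-difference claim in this excerpt can be sanity-checked with a toy experiment (my own sketch, not taken from the cited references [41,42]): quantize a hypothetical feature vector and classifier weights to 8 bits and check whether the predicted class changes. All names and sizes are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(512).astype(np.float32)          # hypothetical feature vector
W = rng.standard_normal((10, 512)).astype(np.float32)    # hypothetical 10-class classifier

def fake_quantize(a, bits=8):
    """Uniform symmetric quantization to `bits`, returned de-quantized for comparison."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(a).max() / qmax
    return np.round(a / scale) * scale

logits_fp = W @ x
logits_q = fake_quantize(W) @ fake_quantize(x)

print("fp32 prediction:", int(np.argmax(logits_fp)),
      "| 8-bit prediction:", int(np.argmax(logits_q)))
print("largest logit deviation:", float(np.abs(logits_fp - logits_q).max()))
```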
“…Such quantization greatly reduces the model size and computational complexity, making it suitable for hardware implementation. S. Fox et al. [32] implemented a training accelerator based on 8-bit integer operations. It processes the forward and backward computations on the FPGA with 8-bit integers, while the weight-update computation is processed in full precision on an ARM processor.…”
Section: Related Work
confidence: 99%
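A rough sketch of the hardware/software split this excerpt describes, under my own assumptions about shapes and scaling: the forward and weight-gradient GEMMs use 8-bit integers with int32 accumulation (standing in for the FPGA datapath), while a separate fp32 master copy of the weights receives the SGD update (standing in for the ARM side). Function and variable names are illustrative, not from [32].

```python
import numpy as np

rng = np.random.default_rng(2)

def quantize_int8(a):
    """Symmetric per-tensor int8 quantization; returns integer values and the scale."""
    scale = np.abs(a).max() / 127.0 + 1e-12
    q = np.clip(np.round(a / scale), -127, 127).astype(np.int8)
    return q, scale

# Master weights live in full precision.
w_fp32 = rng.standard_normal((4, 8)).astype(np.float32)
x = rng.standard_normal((16, 8)).astype(np.float32)         # toy activations
grad_out = rng.standard_normal((16, 4)).astype(np.float32)  # made-up upstream gradient

# "FPGA side": int8 forward and backward GEMMs, accumulated in int32.
xq, sx = quantize_int8(x)
wq, sw = quantize_int8(w_fp32)
gq, sg = quantize_int8(grad_out)
y = (xq.astype(np.int32) @ wq.astype(np.int32).T) * (sx * sw)       # forward output
grad_w = (gq.astype(np.int32).T @ xq.astype(np.int32)) * (sg * sx)  # weight gradient

# "ARM side": full-precision SGD update on the master copy.
lr = 1e-2
w_fp32 -= lr * grad_w.astype(np.float32)
print("updated fp32 master weights (first row):", w_fp32[0, :4])
```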
“…L. Yang et al. [33] implemented a binarized neural network (BNN) on FPGA, replacing the original binary convolution layer with two parallel binary convolutional layers for fast inference. These previous studies [22]–[33] follow the sequential processing order of the gradient descent algorithm.…”
Section: Related Work
confidence: 99%
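For context, a generic sketch of the binary convolution that BNN layers build on, assuming the usual sign (+1/-1) binarization; it does not reproduce the specific two-parallel-layer structure of [33]. On hardware the elementwise product of two {-1,+1} operands reduces to an XNOR and the accumulation to a popcount; here it is written with ordinary floating-point arithmetic for clarity.

```python
import numpy as np

rng = np.random.default_rng(3)

def binarize(a):
    """Deterministic sign binarization to {-1, +1}."""
    return np.where(a >= 0, 1.0, -1.0).astype(np.float32)

x = rng.standard_normal((1, 3, 8, 8)).astype(np.float32)   # toy NCHW input
w = rng.standard_normal((4, 3, 3, 3)).astype(np.float32)   # 4 output channels, 3x3 kernels

xb, wb = binarize(x), binarize(w)

# Direct binary convolution (stride 1, no padding).
n, c, h, win = xb.shape
k, _, kh, kw = wb.shape
out = np.zeros((n, k, h - kh + 1, win - kw + 1), dtype=np.float32)
for oc in range(k):
    for i in range(out.shape[2]):
        for j in range(out.shape[3]):
            out[0, oc, i, j] = np.sum(xb[0, :, i:i+kh, j:j+kw] * wb[oc])

print("binary conv output shape:", out.shape)   # (1, 4, 6, 6)
```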