2019
DOI: 10.1109/jssc.2018.2865489
UNPU: An Energy-Efficient Deep Neural Network Accelerator With Fully Variable Weight Bit Precision

Cited by 231 publications (104 citation statements)
References 12 publications
“…The comparison is made in terms of performance and the number of memory accesses. UNPU [47] is selected as the representative of conventional DNN designs. The reason for this selection is that UNPU provides a look-up table-based PE (LBPE) to support matrix multiplication and MAC operations.…”
Section: Evaluation Results and Analysis
confidence: 99%
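The excerpt above credits UNPU with a look-up table-based PE (LBPE): dot products are evaluated by precomputing partial sums of an input group once, then indexing that table with weight bits, bit-serially, so that weight precision can vary at run time. As a rough illustration only (not the paper's implementation — the 4-input group size, unsigned weights, and function names are assumptions for this sketch), the idea can be written as:

```python
def build_lut(inputs):
    # Precompute the sum of every subset of a 4-input group:
    # entry k holds the sum of inputs whose position bit is set in k.
    lut = []
    for k in range(16):
        lut.append(sum(x for i, x in enumerate(inputs) if (k >> i) & 1))
    return lut

def lut_mac(inputs, weights, bits):
    # Bit-serial LUT-based MAC sketch: `weights` are unsigned integers
    # of `bits` precision; one table lookup replaces four multiplies
    # per weight bit plane.
    assert len(inputs) == len(weights) == 4
    lut = build_lut(inputs)
    acc = 0
    for b in range(bits):  # iterate weight bit planes, LSB first
        idx = sum(((w >> b) & 1) << i for i, w in enumerate(weights))
        acc += lut[idx] << b  # shift-accumulate the partial product
    return acc
```

Because every bit plane reuses the same 16-entry table, extending or shrinking weight precision only changes the number of loop iterations, not the datapath — which is the property the excerpt's "fully variable weight bit precision" refers to.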
“…However, the performance can be improved by employing other advanced topologies [42], [45], mapping [40], and routing techniques [48]. For the memory bandwidth, we assume that the bandwidth is sufficient to access 16 bits of data within one cycle (i.e., similar to the assumption in UNPU [47]). With regard to the target DNN networks, we evaluated LeNet [49], MobileNet [50], and VGG-16 [4] as representatives of small-, medium-, and large-scale DNN models, respectively.…”
Section: Evaluation Results and Analysis
confidence: 99%
“…By considering technology scaling, we see that the energy efficiency (in terms of TOP/s/W) of PPAC is comparable to that of the two fully-digital designs in [23], [24] but 7.9× and 2.3× lower than that of the mixed-signal designs in [6] and [19], respectively, where the latter is implemented in a technology node comparable to that of PPAC. As noted in Section III-D, mixed-signal designs are particularly useful for tasks that are resilient to noise or process variation, such as neural network inference.…”
Section: B. Comparison With Existing Accelerators
confidence: 97%
“…However, one of the challenges of the modern machine learning algorithms is their energy dissipation [13]. Most of the machine learning hardware development is done using either standard cell digital design methods [14,15] or mixed-signal methods [16] employing analogue processing techniques in CMOS technologies. The advancement and scaling of CMOS technologies have always been based on improving the performance of digital systems.…”
Section: Introduction
confidence: 99%