2020
DOI: 10.1109/tcsi.2020.3019460

Weight-Oriented Approximation for Energy-Efficient Neural Network Inference Accelerators

Cited by 59 publications (83 citation statements). References 38 publications.
“…For example, existing NN architectures trade throughput (e.g., [5] combines many MADD units to enable higher computational precision) or speed (e.g., [30] uses 10-bit MADD units that are 1.15x slower than the 8-bit ones) to achieve higher inference accuracy. Similarly, [19], [32], [33] apply approximations, trading accuracy to improve speed and/or energy consumption. As a result, NC-FinFET provides new insights and new directions to future NN accelerator architects as well as NN developers.…”
Section: Neural Network Inference Evaluation
confidence: 99%
“…Acknowledging the need for runtime reconfiguration, [9] generates approximate multipliers with dynamically reconfigurable accuracy and uses them to apply layer-wise approximation in DNNs by changing the multiplier's accuracy mode per layer. The work in [8] uses [9] to generate low-variance approximate reconfigurable multipliers, and proposes a weight-oriented approximation for DNN inference. [15] employs a curable approximation in which the MAC's adder is split into low and high parts and the carry of the low part is accumulated by the neighboring MAC unit.…”
Section: Related Work
confidence: 99%
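The layer-wise reconfigurable approximation described in this statement can be sketched as a toy model. This is only an illustration, not the circuits of [9] or [15]: truncation of low-order result bits stands in for the approximate multiplier hardware, and the per-layer mode settings are hypothetical.

```python
def approx_mul(a: int, b: int, mode: int) -> int:
    """Toy approximate multiply: zero out the `mode` least-significant
    bits of the exact product, trading accuracy for (hardware) energy.
    mode=0 is exact; larger modes are more aggressive approximations."""
    exact = a * b
    return (exact >> mode) << mode  # truncate low-order bits


def mac(acc: int, a: int, b: int, mode: int) -> int:
    """Multiply-accumulate using the approximate multiplier."""
    return acc + approx_mul(a, b, mode)


# Layer-wise approximation: each layer is assigned its own accuracy mode
# (hypothetical settings for illustration only).
layer_modes = {"conv1": 0, "conv2": 2, "fc": 4}
```

Under this model, switching a layer's entry in `layer_modes` at runtime changes the accuracy/energy trade-off of every MAC in that layer, which is the reconfiguration idea the statement attributes to [9].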
“…As can be seen, the energy gains increase as the value of z increases. However, the magnitude of the multiplication error, both in PE and in NE, becomes larger as well, as calculated by (8). Therefore, in Section III-B we present a method to map the weights to specific modes in order to keep the overall inference accuracy loss low.…”
Section: A Positive/Negative Approximate Multiplier in NNs
confidence: 99%
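A minimal sketch of the positive/negative mode idea mentioned above, assuming a toy error model in which the error magnitude grows with z, and a simple alternating mode-assignment policy. The cited work maps weights to modes based on their values so that errors cancel; the error model and policy here are illustrative placeholders only.

```python
def approx_mul_pn(w: int, x: int, mode: str, z: int) -> int:
    """Toy positive/negative approximate multiplier: the PE mode adds a
    positive error and the NE mode a negative one, with the error
    magnitude growing as z increases (illustrative model, not eq. (8))."""
    exact = w * x
    err = abs(x) >> (8 - z)  # error magnitude increases with z
    return exact + err if mode == "PE" else exact - err


def map_weights_to_modes(weights):
    """Toy mapping policy: alternate PE/NE assignments so the signed
    errors tend to cancel when accumulated across a dot product."""
    return ["PE" if i % 2 == 0 else "NE" for i in range(len(weights))]
```

With such a mapping, the positive error of one product is offset by the negative error of the next during accumulation, which is the intuition behind keeping the overall inference accuracy loss low while increasing z for energy gains.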
“…The utilization of approximate multipliers in the hardware design of convolutional neural networks (CNNs) has been proposed previously to enhance performance in terms of power, speed, and area [23][24][25][26][27][28]. Moreover, a reconfigurable approximate multiplier based on calculating the error variance was proposed in [29]. Lower-precision approximate multipliers can achieve higher performance gains, as can be seen in [23][24][25][26][27].…”
Section: Introduction
confidence: 99%