2020
DOI: 10.1109/tcsi.2020.3019460

Weight-Oriented Approximation for Energy-Efficient Neural Network Inference Accelerators

Cited by 59 publications (83 citation statements). References 38 publications.
“…For example, existing NN architectures trade throughput (e.g., [5] combines many MADD units to enable higher computational precision) or speed (e.g., [30] uses 10-bit MADD units that are 1.15x slower than the 8-bit ones) to achieve higher inference accuracy. Similarly, [19], [32], [33] apply approximations, trading accuracy to improve speed and/or energy consumption. As a result, NC-FinFET provides new insights and new directions to future NN accelerator architects as well as NN developers.…”
Section: Neural Network Inference Evaluation
confidence: 99%
“…Acknowledging the need for runtime reconfiguration, [9] generates approximate multipliers with dynamically reconfigurable accuracy and uses them to apply layer-wise approximation in DNNs by changing the multiplier's accuracy mode per layer. The work in [8] uses [9] to generate low-variance approximate reconfigurable multipliers, and proposes a weight-oriented approximation for DNN inference. [15] employs a curable approximation in which the MAC's adder is split into low and high parts and the carry of the low part is accumulated by the neighboring MAC unit.…”
Section: Related Work
confidence: 99%
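The layer-wise reconfigurable approximation described in this statement can be sketched as a toy model. This is only an illustration, not the circuits of [9] or [15]: truncation of low-order result bits stands in for the approximate multiplier hardware, and the per-layer mode settings are hypothetical.

```python
def approx_mul(a: int, b: int, mode: int) -> int:
    """Toy approximate multiply: zero out the `mode` least-significant
    bits of the exact product, trading accuracy for (hardware) energy.
    mode=0 is exact; larger modes are more aggressive approximations."""
    exact = a * b
    return (exact >> mode) << mode  # truncate low-order bits


def mac(acc: int, a: int, b: int, mode: int) -> int:
    """Multiply-accumulate using the approximate multiplier."""
    return acc + approx_mul(a, b, mode)


# Layer-wise approximation: each layer is assigned its own accuracy mode
# (hypothetical settings for illustration only).
layer_modes = {"conv1": 0, "conv2": 2, "fc": 4}
```

Under this model, switching a layer's entry in `layer_modes` at runtime changes the accuracy/energy trade-off of every MAC in that layer, which is the reconfiguration idea the statement attributes to [9].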
“…As can be seen, the energy gains increase as the value of z increases. However, the magnitude of the multiplication error, both in PE and in NE, becomes larger as well, as calculated by (8). Therefore, in Section III-B we present a method to map the weights to specific modes in order to keep the overall inference accuracy loss low.…”
Section: A Positive/Negative Approximate Multiplier in NNs
confidence: 99%
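A minimal sketch of the positive/negative mode idea mentioned above, assuming a toy error model in which the error magnitude grows with z, and a simple alternating mode-assignment policy. The cited work maps weights to modes based on their values so that errors cancel; the error model and policy here are illustrative placeholders only.

```python
def approx_mul_pn(w: int, x: int, mode: str, z: int) -> int:
    """Toy positive/negative approximate multiplier: the PE mode adds a
    positive error and the NE mode a negative one, with the error
    magnitude growing as z increases (illustrative model, not eq. (8))."""
    exact = w * x
    err = abs(x) >> (8 - z)  # error magnitude increases with z
    return exact + err if mode == "PE" else exact - err


def map_weights_to_modes(weights):
    """Toy mapping policy: alternate PE/NE assignments so the signed
    errors tend to cancel when accumulated across a dot product."""
    return ["PE" if i % 2 == 0 else "NE" for i in range(len(weights))]
```

With such a mapping, the positive error of one product is offset by the negative error of the next during accumulation, which is the intuition behind keeping the overall inference accuracy loss low while increasing z for energy gains.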
“…The utilization of approximate multipliers in the hardware design of convolutional neural networks (CNNs) has been proposed previously to enhance performance in terms of power, speed, and area [23][24][25][26][27][28]. Moreover, a reconfigurable approximate multiplier based on calculating the error variance was proposed in [29]. Lower-precision approximate multipliers can achieve higher performance gains, as can be seen in [23][24][25][26][27].…”
Section: Introduction
confidence: 99%