An Energy-efficient Reconfigurable Hybrid DNN Architecture for Speech Recognition with Approximate Computing

One major challenge in deploying Deep Neural Network (DNN) in resource-constrained applications, such as edge nodes, mobile embedded systems, and IoT devices, is its high energy cost. The emerging approximate computing methodology can effectively reduce the energy consumption during the computing process in DNN. However, a recent study shows that the weight storage and access operations can dominate DNN's energy consumption due to the fact that the huge size of DNN weights must be stored in the high-energy-cost DRAM. In this paper, we propose Double-Shift, a low-power DNN weight storage and access framework, to solve this problem. Enabled by approximate decomposition and quantization, Double-Shift can reduce the data size of the weights effectively. By designing a novel weight storage allocation strategy, Double-Shift can boost the energy efficiency by trading the energy consuming weight storage and access operations for low-energy-cost computations. Our experimental results show that Double-Shift can reduce DNN weights to 3.96%–6.38% of the original size and achieve an energy saving of 86.47%–93.62%, while introducing a DNN classification error within 2%.

show abstract

Early Prediction of DNN Activation Using Hierarchical Computations

Suresh

Pillai

Kalsi

et al. 2021

Mathematics

View full text Add to dashboard Cite

Deep Neural Networks (DNNs) have set state-of-the-art performance numbers in diverse fields of electronics (computer vision, voice recognition), biology, bioinformatics, etc. However, the process of learning (training) from the data and application of the learnt information (inference) process requires huge computational resources. Approximate computing is a common method to reduce computation cost, but it introduces loss in task accuracy, which limits their application. Using an inherent property of Rectified Linear Unit (ReLU), a popular activation function, we propose a mathematical model to perform MAC operation using reduced precision for predicting negative values early. We also propose a method to perform hierarchical computation to achieve the same results as IEEE754 full precision compute. Applying this method on ResNet50 and VGG16 shows that up to 80% of ReLU zeros (which is 50% of all ReLU outputs) can be predicted and detected early by using just 3 out of 23 mantissa bits. This method is equally applicable to other floating-point representations.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

An Energy-efficient Reconfigurable Hybrid DNN Architecture for Speech Recognition with Approximate Computing

Cited by 3 publications

References 22 publications

Using a Hybrid Deep Neural Network for Gas Classification

Using a Hybrid Deep Neural Network for Gas Classification

Double-Shift: A Low-Power DNN Weights Storage and Access Framework based on Approximate Decomposition and Quantization

Early Prediction of DNN Activation Using Hierarchical Computations

Contact Info

Product

Resources

About