2020
DOI: 10.1109/access.2020.3039858
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

Abstract: Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications, ranging from computer vision for medicine to autonomous driving of modern cars, as well as other sectors such as security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks, requiring significant computational power during both training and inference. A single inference of a DL model may require billions of multi…

Cited by 126 publications (69 citation statements)
References 246 publications (277 reference statements)
“…However, such models usually have high computational complexity and a large memory footprint, so they are not well suited for resource-constrained embedded systems. The use of cloud computing is inconvenient when operation-critical apparatus must be monitored, due to the low reliability and high latency of remote connections, which require sufficient bandwidth to guarantee real-time operation; general-purpose platforms using CPUs and GPUs have silicon sizes, prices, and energy costs that are incompatible with integration into the apparatus to be monitored [5]. Similar limitations affect dedicated processors, such as the Xilinx Deep Learning Processor Unit (DPU) core [20], introduced to accelerate CNN inference on FPGAs.…”
Section: Related Work
confidence: 99%
“…Recent DL approaches are based on Convolutional Neural Networks (CNN), Generative Adversarial Networks (GAN), Recurrent Neural Networks (RNN), etc., each having its own strengths and weaknesses [1]. However, deploying such networks in PdM systems is very challenging and costly because those models' very high computational complexity, memory requirements, and power consumption do not meet the needs of always-on monitoring of the apparatus [4], [5]. In this context, Auto-Encoders (AE) [6] are an obvious choice, since they combine relatively shallow networks with the possibility of unsupervised training, which is particularly attractive when labelled fault data are hard to obtain.…”
Section: Introduction
confidence: 99%
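
The quoted passage motivates shallow auto-encoders for predictive maintenance: train only on healthy-machine data, then flag faults by reconstruction error. Below is a minimal sketch of that idea (our illustration, not code from the cited paper; the layer sizes, optimizer settings, and threshold rule are assumptions):

```python
# Minimal sketch: a shallow auto-encoder for unsupervised fault detection.
# Sizes, optimizer, and threshold are illustrative assumptions.
import torch
import torch.nn as nn

class ShallowAE(nn.Module):
    def __init__(self, n_features: int, n_latent: int = 8):
        super().__init__()
        # One hidden layer each way keeps the network shallow,
        # matching the resource constraints discussed above.
        self.encoder = nn.Sequential(nn.Linear(n_features, n_latent), nn.ReLU())
        self.decoder = nn.Linear(n_latent, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_on_normal_data(model, loader, epochs: int = 20):
    # Unsupervised training: the AE only ever sees healthy-machine
    # signals, so no labelled fault data are required.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for (x,) in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), x)
            loss.backward()
            opt.step()

def is_anomalous(model, x, threshold: float) -> torch.Tensor:
    # At inference, a high reconstruction error signals a deviation
    # from the learned "normal" behaviour, i.e. a potential fault.
    with torch.no_grad():
        err = ((model(x) - x) ** 2).mean(dim=1)
    return err > threshold

if __name__ == "__main__":
    from torch.utils.data import DataLoader, TensorDataset
    normal = torch.randn(512, 16)  # stand-in for healthy sensor data
    model = ShallowAE(n_features=16)
    train_on_normal_data(model, DataLoader(TensorDataset(normal), batch_size=64))
    print(is_anomalous(model, torch.randn(4, 16) * 5, threshold=1.0))
```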
“…A commonly overlooked issue with advanced machine-learning methods is their energy consumption: model training and development likely account for a substantial portion of greenhouse-gas emissions because of the excessively large models used, for example, in natural language processing [30]. An important solution direction is to use special hardware designed for energy-efficient learning [31,32].…”
Section: Related Research on AI Issues
confidence: 99%
“…This increased complexity hinders the deployment of advanced NNs (DNNs and SNNs) on resource-constrained edge devices [4]. Therefore, optimizations at different system layers (i.e., HW and SW) are necessary to enable the use of advanced NNs at the edge [2]. Besides performance and energy efficiency, reliability and security aspects are also important to ensure…”
Section: Introduction
confidence: 99%
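
The quoted passage refers to software-layer optimizations only in general terms. One common example in this space is post-training quantization; the sketch below (our illustration under stated assumptions, not a method from the quoted paper or the survey) shows symmetric per-tensor 8-bit quantization, which cuts weight storage by 4x at a small accuracy cost:

```python
# Minimal sketch of symmetric per-tensor int8 post-training quantization,
# one example of a software-layer optimization for edge deployment.
import numpy as np

def quantize_int8(w: np.ndarray):
    # One per-tensor scale maps the largest |weight| to 127.
    scale = max(float(np.abs(w).max()), 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # The rounding error introduced here is the accuracy cost
    # paid for the 4x reduction in weight storage.
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    w = np.random.randn(256, 256).astype(np.float32)
    q, s = quantize_int8(w)
    print("max abs error:", float(np.abs(dequantize(q, s) - w).max()))
```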