2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
DOI: 10.1109/cahpc.2018.8645906

On the Resilience of RTL NN Accelerators: Fault Characterization and Mitigation

Abstract: Machine Learning (ML) is making a strong resurgence, in tune with the massive generation of unstructured data, which in turn requires massive computational resources. Due to the inherently compute- and power-intensive structure of Neural Networks (NNs), hardware accelerators emerge as a promising solution. However, with technology nodes scaling below 10 nm, hardware accelerators become more susceptible to faults, which in turn can impact NN accuracy. In this paper, we study the resilience aspects of Register-Transfer Level (RTL)…

Cited by 72 publications (62 citation statements) · References 35 publications
“…We observed that the various layers of a given NN have different inherent vulnerability to faults. We conducted a pre-processing analysis and observed that inner layers (layers closer to the output) are relatively more vulnerable, as similarly observed in [31], [32], [33], since faults in these layers are less likely to be masked through the quantization in the activation functions. The sensitivity of the NN layers, i.e., {Layer_j, j ∈ [0, 4]}, is evaluated by injecting simulated randomly-generated faults into the corresponding weights of individual layers at the Register-Transfer Level (RTL).…”
Section: Fault Mitigation Technique (mentioning)
confidence: 70%
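As an illustration of the layer-wise sensitivity analysis quoted above, the following minimal Python/NumPy sketch flips random bits in the weights of one layer at a time and measures the resulting accuracy drop against fault-free ("golden") outputs. The network shape, dataset, fault counts, and the flip_random_bits helper are illustrative assumptions, not the paper's actual RTL fault-injection setup.

```python
# Illustrative sketch of layer-wise weight fault injection (not the paper's
# RTL setup): flip random bits in one layer's float32 weights at a time and
# measure the accuracy drop against fault-free "golden" labels.
import numpy as np

rng = np.random.default_rng(0)

def flip_random_bits(weights, n_faults, rng):
    """Simulate bit-flip faults in the float32 weight registers of one layer."""
    flat = weights.astype(np.float32).ravel().copy()
    raw = flat.view(np.uint32)                 # reinterpret bits, same buffer
    for idx in rng.integers(0, raw.size, size=n_faults):
        raw[idx] ^= np.uint32(1) << rng.integers(0, 32, dtype=np.uint32)
    return flat.reshape(weights.shape)

def forward(x, layers):
    for w in layers[:-1]:
        x = np.maximum(x @ w, 0.0)             # ReLU can mask perturbations
    return x @ layers[-1]

def accuracy(layers, xs, ys):
    return float((forward(xs, layers).argmax(axis=1) == ys).mean())

dims = [16, 32, 32, 32, 32, 10]                # 5 weight layers: Layer_0..Layer_4
layers = [rng.normal(scale=0.3, size=(m, n)).astype(np.float32)
          for m, n in zip(dims, dims[1:])]
xs = rng.normal(size=(512, dims[0]))
ys = forward(xs, layers).argmax(axis=1)        # fault-free ("golden") labels
baseline = accuracy(layers, xs, ys)            # 1.0 by construction

for j, w in enumerate(layers):
    drops = [baseline - accuracy(layers[:j] + [flip_random_bits(w, 8, rng)]
                                 + layers[j + 1:], xs, ys)
             for _ in range(20)]               # average repeated injections
    print(f"Layer_{j}: mean accuracy drop = {np.mean(drops):.3f}")
```

In this toy setting, faults injected into layers closer to the output tend to cause larger accuracy drops, since fewer downstream activations can mask them, consistent with the observation quoted above.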
“…Ares [61] is a framework for quantifying the resilience of deep neural networks. In addition, [31] studied an RTL model of an NN from a resilience perspective by injecting faults into the registers of the design, and more recently [33] studied fault propagation in an ASIC model of an NN, focusing on the vulnerability of different NN layers.…”
Section: B. Recent Related Studies on NNs (mentioning)
confidence: 99%
“…Hence, the resilience of DNNs has recently been studied at different abstraction levels. The vast majority of previous works in this area target the DNN inference phase, including simulation-based efforts [33]–[36] and works on real hardware [12], [37]–[39]. Verifying the simulation-based works on real fabric is a crucial concern; moreover, the real-hardware works are mostly performed on customized ASICs, so whether those results can be reproduced on COTS systems remains an open question.…”
Section: Related Work (mentioning)
confidence: 99%
“…ThUnderVolt [178] proposes to underscale the voltage of arithmetic elements. Salami et al. [141] and Zhang et al. [179] present fault-mitigation techniques for neural networks that minimize errors in faulty registers and logic blocks through pruning and retraining.…”
Section: Related Work (mentioning)
confidence: 99%
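The pruning-and-retraining mitigation mentioned in this statement can be sketched as follows: hidden neurons assumed to be mapped onto faulty hardware units are pruned (their weights forced to zero), and the surviving weights are retrained to compensate. The toy model, task, and masked-gradient loop below are illustrative assumptions, not the cited papers' implementations.

```python
# Hedged sketch of fault-aware pruning + retraining (illustrative only):
# zero out the weights of neurons assumed to sit on faulty MAC units, then
# retrain the remaining weights with masked gradient updates.
import numpy as np

rng = np.random.default_rng(1)

# Toy regression task and a small two-layer ReLU network.
X = rng.normal(size=(512, 8))
Y = np.sin(X @ rng.normal(size=(8, 2)))
W1 = rng.normal(scale=0.5, size=(8, 32))
W2 = rng.normal(scale=0.5, size=(32, 2))

def mse(W1, W2):
    H = np.maximum(X @ W1, 0.0)
    return float(((H @ W2 - Y) ** 2).mean())

faulty = np.array([3, 7, 19])                  # hypothetical faulty hidden units
mask1 = np.ones_like(W1); mask1[:, faulty] = 0.0   # fan-in of pruned neurons
mask2 = np.ones_like(W2); mask2[faulty, :] = 0.0   # fan-out of pruned neurons
W1 *= mask1
W2 *= mask2
print(f"MSE after pruning:    {mse(W1, W2):.4f}")

lr = 1e-2
for _ in range(1000):                          # retrain the surviving weights
    H = np.maximum(X @ W1, 0.0)
    E = 2.0 * (H @ W2 - Y) / Y.size            # dLoss/dOutput for mean MSE
    gW2 = H.T @ E
    gW1 = X.T @ ((E @ W2.T) * (H > 0.0))       # ReLU backprop
    W1 -= lr * gW1 * mask1                     # masked updates keep pruning intact
    W2 -= lr * gW2 * mask2
print(f"MSE after retraining: {mse(W1, W2):.4f}")
```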
“…Sixth, works that study the intrinsic error resilience of DNNs by injecting randomly-distributed errors into DNN data [110, 141, 163, 166, 179, 180]. These works assume that the errors can come from any component of the system (i.e., they do not target a specific approximate hardware component).…”
Section: Related Work (mentioning)
confidence: 99%