Using Machine Learning Techniques to Evaluate Multicore Soft Error Reliability

Rosa, Felipe; Garibotti, Rafael; Ost, Luciano; Reis, Ricardo

doi:10.1109/tcsi.2019.2906155

Cited by 38 publications

(18 citation statements)

References 24 publications

“…In the context of soft error assessment, with the exception of [34], [36] and this work, reviewed approaches do not consider resource-constraint on their experiments. The majority of these works consider either FPGA implementations of ML algorithms [9], [32], [35] or their execution on GPU [30], DNN accelerators [10], [29], [37] or general-purpose processors [31], [33]. On the soft error mitigation side, traditional partial TMR or specific mitigation techniques have been considered either in FPGA implementations [9] or applied to specialized hardware accelerator [10] or more generic GPUs [30].…”

Section: B Review Of Soft Error Assessment Of ML Algorithms (mentioning)

confidence: 99%

Applying Lightweight Soft Error Mitigation Techniques to Embedded Mixed Precision Deep Neural Networks

Abich, Gava, Garibotti, et al. 2021

IEEE Trans. Circuits Syst. I

Self Cite

Deep neural networks (DNNs) are being incorporated in resource-constrained IoT devices, which typically rely on reduced memory footprint and low-performance processors. While DNNs' precision and performance can vary and are essential, it is also vital to deploy trained models that provide high reliability at low cost. To achieve an unyielding reliability and safety level, it is imperative to provide electronic computing systems with appropriate mechanisms to tackle soft errors. This paper, therefore, investigates the relationship between soft errors and model accuracy. In this regard, an extensive soft error assessment of the MobileNet model is conducted considering precision bitwidth variations (2, 4, and 8 bits) running on an Arm Cortex-M processor. In addition, this work promotes the use of a register allocation technique (RAT) that allocates the critical DNN function/layer to a pool of specific general-purpose processor registers. Results obtained from more than 4.5 million fault injections show that RAT gives the best relative performance, memory utilization, and soft error reliability trade-offs w.r.t. a more traditional replication-based approach. Results also show that the MobileNet soft error reliability varies depending on the precision bitwidth of its convolutional layers.

“…While authors in [8,21] employ machine learning to reduce the time needed for the fault injection campaigns, Rosa et al [9] promote a module that uses ML algorithms to correlate large subsets of application profiles and architecture characteristics with fault injection results. The developed ML-based module reduces user intervention and enables the identification of relevant relationships or associations between application profiling and specific single or multicore platform parameters.…”

Section: Soft Error Assessment Capable VP Framework (mentioning)

confidence: 99%

“…With this in mind, researchers and market leaders are investigating new alternatives, such as the use of virtual platform fault injection (FI) frameworks [2][3][4][5][6][7][8][9][10]. Such frameworks allow enormous productivity…”

Section: Introduction (mentioning)

confidence: 99%

Sofia: An Automated Framework for Early Soft Error Assessment, Identification, and Mitigation

Gava, Bandeira, Rosa, et al. 2022

SSRN Journal

Self Cite

“…Considering the application of non-intrusive SIHFT methods, the works described in [20] and [21] employed genetic algorithms that select the best compilation flags of certain applications for their reliability optimization. Rocha et al [22] have recently proposed a soft-error score, which combines supervised and unsupervised ML algorithms, to correlate application profiles and processor characteristics with fault-injection results. The score was devised to assist researchers with the identification of the parameters that have most influence on the reliability of the final application.…”

Section: Related Work (mentioning)

confidence: 99%

“…Therefore, they can be used to obtain the fault coverage statistics, in terms of Silent Data Corruption (SDC), when an erroneous output is produced, and in terms of HANG when the platform stops working [29]. In particular, simulators such as [22], [36], [37] can estimate, given a program and its configuration, both the SDC and the HANG values. In both cases, the simulated fault injection was separately performed on each register and the memory was also injected, taking into account the program data, the program code and the stack section.…”

Section: A DUTs - Devices Under Test - ARM and MSP430 (mentioning)

confidence: 99%

Empirical Mathematical Model of Microprocessor Sensitivity and Early Prediction to Proton and Neutron Radiation-Induced Soft Errors

Serrano-Cases, Reyneri, Morilla, et al. 2020

IEEE Trans. Nucl. Sci.

A mathematical model is described to predict microprocessor fault tolerance under radiation. The model is empirically trained by combining data from simulated fault-injection campaigns, and radiation experiments, both with protons (at the CNA facilities, Seville, Spain) and with neutrons (at the LANSCE Weapons Neutron Research facility at Los Alamos, USA). The sensitivity to soft errors of different blocks of commercial processors is identified to estimate the reliability of a set of programs that had previously been optimized, hardened, or both. The results showed a standard error under 0.1, in the case of the ARM processor, and 0.12, in the case of the MSP430 microcontroller.
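
The citation statement from this paper describes simulated fault injection performed separately on each register, with outcomes counted as SDC (an erroneous output is produced) or HANG (the platform stops working). The bookkeeping behind such a campaign might look like the minimal Python sketch below. Everything in it is an assumption made for illustration: the toy workload, the three-entry register list, the `MAX_STEPS` watchdog budget, and the names `toy_program`, `classify`, and `campaign` are hypothetical and do not come from any of the cited simulators. The extra MASKED label, for runs whose output matches the fault-free one, is also an addition of this sketch.

```python
from collections import Counter

MAX_STEPS = 1_000  # hypothetical watchdog budget: running past it counts as HANG


def toy_program(regs, fault=None):
    """Toy workload: sum 1..n, with n in regs[0] and the accumulator in regs[1].

    fault is an optional (step, register, bit) triple; that bit is flipped
    once, just before the given loop step. Returns the accumulator, or None
    if the watchdog budget is exhausted.
    """
    regs = list(regs)  # work on a copy so every run starts from the same state
    for step in range(MAX_STEPS):
        if fault is not None and fault[0] == step:
            regs[fault[1]] ^= 1 << fault[2]  # single-bit upset
        if regs[0] == 0:
            return regs[1]
        regs[1] += regs[0]
        regs[0] -= 1
    return None  # watchdog expired


def classify(golden, observed):
    """Label one run by comparing it with the fault-free (golden) output."""
    if observed is None:
        return "HANG"
    return "MASKED" if observed == golden else "SDC"


def campaign(initial_regs, steps, bits):
    """Inject one bit flip per run, register by register, and tally outcomes."""
    golden = toy_program(initial_regs)
    counts = Counter()
    for reg in range(len(initial_regs)):
        for step in steps:
            for bit in bits:
                observed = toy_program(initial_regs, (step, reg, bit))
                counts[(reg, classify(golden, observed))] += 1
    return golden, counts


if __name__ == "__main__":
    # regs[2] is never read by the workload, so flips there should be masked.
    golden, counts = campaign([5, 0, 7], steps=range(5), bits=range(16))
    for (reg, outcome), n in sorted(counts.items()):
        print(f"r{reg} {outcome}: {n}")
```

Tallying per register, rather than only in aggregate, mirrors the statement that each register was injected separately. A real campaign would also cover the memory sections the statement mentions (program data, program code, and stack), which this sketch leaves out.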
