Deep neural networks (DNNs) are being incorporated into resource-constrained IoT devices, which typically rely on a reduced memory footprint and low-performance processors. While the precision and performance of DNNs are essential and can vary, it is also vital to deploy trained models that provide high reliability at low cost. To achieve an unyielding reliability and safety level, it is imperative to provide electronic computing systems with appropriate mechanisms to tackle soft errors. This paper, therefore, investigates the relationship between soft errors and model accuracy. To this end, an extensive soft error assessment of the MobileNet model is conducted, considering precision bitwidth variations (2, 4, and 8 bits), running on an Arm Cortex-M processor. In addition, this work promotes the use of a register allocation technique (RAT) that allocates the critical DNN function/layer to a pool of specific general-purpose processor registers. Results obtained from more than 4.5 million fault injections show that RAT gives the best relative trade-offs among performance, memory utilization, and soft error reliability with respect to a more traditional replication-based approach. Results also show that the MobileNet soft error reliability varies depending on the precision bitwidth of its convolutional layers.
To achieve a substantial reliability and safety level, it is imperative to provide electronic computing systems with appropriate mechanisms to tackle soft errors. This paper proposes a low-cost system-level soft error mitigation technique, which allocates the critical application function to a pool of specific general-purpose processor registers. Both the critical function and the register pool are automatically selected by a profiling tool developed for this purpose. The proposed technique was validated through more than 400K fault injections considering a Linux kernel, different benchmarks, and two multicore Arm processor architectures (ARMv7-A and ARMv8-A). Results show that our technique significantly reduces the code size and performance overheads while providing soft error reliability improvement compared with the Triple Modular Redundancy (TMR) technique.
Soft error resilience has become an essential design metric in electronic computing systems as advanced technology nodes have become less robust to high‐charged particle effects. Designers, therefore, should be able to assess this metric early in the design phase, considering several software stack components running on top of commercial processors. With this in mind, researchers are using virtual platform (VP) frameworks to assess this metric due to their flexibility and high simulation performance. To this end, this work analyses the soft error consistency of a just‐in‐time fault injection simulator (OVPsim‐FIM) against fault injection campaigns conducted with event‐driven simulators (i.e. more realistic and accurate platforms), considering single and multicore processor architectures. Reference single‐core fault injection campaigns are performed on RTL descriptions of Arm Cortex‐M0 and M3 processors, while the gem5 simulator is used for multicore Arm Cortex‐A9 scenarios. Campaigns consider different open‐source and commercial compilers as well as real software stacks including FreeRTOS/Linux kernels and 52 applications. Results show that OVPsim‐FIM is more than 1000× faster than cycle‐accurate simulators and up to 312× faster than event‐driven simulators, while preserving the soft error analysis accuracy (i.e. mismatch below 10%) for single and multicore processors.