2019
DOI: 10.1109/tcsi.2019.2906155

Using Machine Learning Techniques to Evaluate Multicore Soft Error Reliability

Abstract: Virtual platform frameworks have been extended to allow earlier soft error analysis of more realistic multicore systems (i.e., real software stacks, state-of-the-art ISAs). The high observability and simulation performance of the underlying frameworks make it possible to generate and collect more error/failure-related data, considering complex software stack configurations, in a reasonable time. When dealing with sizeable failure-related data sets obtained from multiple fault campaigns, it is essential to filter out paramete…

Cited by 38 publications (18 citation statements)
References 24 publications
“…In the context of soft error assessment, with the exception of [34], [36] and this work, reviewed approaches do not consider resource-constraint on their experiments. The majority of these works consider either FPGA implementations of ML algorithms [9], [32], [35] or their execution on GPU [30], DNN accelerators [10], [29], [37] or general-purpose processors [31], [33]. On the soft error mitigation side, traditional partial TMR or specific mitigation techniques have been considered either in FPGA implementations [9] or applied to specialized hardware accelerator [10] or more generic GPUs [30].…”
Section: B Review Of Soft Error Assessment Of ML Algorithms
Citation type: mentioning
confidence: 99%
“…While authors in [8,21] employ machine learning to reduce the time needed for the fault injection campaigns, Rosa et al. [9] promote a module that uses ML algorithms to correlate large subsets of application profiles and architecture characteristics with fault injection results. The developed ML-based module reduces user intervention and enables the identification of relevant relationships or associations between application profiling and specific single or multicore platform parameters.…”
Section: Soft Error Assessment Capable VP Framework
Citation type: mentioning
confidence: 99%
“…With this in mind, researchers and market leaders are investigating new alternatives, such as the use of virtual platform fault injection (FI) frameworks [2][3][4][5][6][7][8][9][10]. Such frameworks allow enormous productivity…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Considering the application of non-intrusive SIHFT methods, the works described in [20] and [21] employed genetic algorithms that select the best compilation flags of certain applications for their reliability optimization. Rocha et al. [22] have recently proposed a soft-error score, which combines supervised and unsupervised ML algorithms, to correlate application profiles and processor characteristics with fault-injection results. The score was devised to assist researchers with the identification of the parameters that have most influence on the reliability of the final application.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Therefore, they can be used to obtain the fault coverage statistics, in terms of Silent Data Corruption (SDC), when an erroneous output is produced, and in terms of HANG when the platform stops working [29]. In particular, simulators such as [22], [36], [37] can estimate, given a program and its configuration, both the SDC and the HANG values. In both cases, the simulated fault injection was separately performed on each register and the memory was also injected, taking into account the program data, the program code and the stack section.…”
Section: A DUTs - Devices Under Test - ARM and MSP430
Citation type: mentioning
confidence: 99%