2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA) 2014
DOI: 10.1109/isca.2014.6853212

GangES: Gang error simulation for hardware resiliency evaluation

Abstract: As technology scales, the hardware reliability challenge affects a broad computing market, rendering traditional redundancy based solutions too expensive. Software anomaly based hardware error detection has emerged as a low cost reliability solution, but suffers from Silent Data Corruptions (SDCs). It is crucial to accurately evaluate SDC rates and identify SDC producing software locations to develop software-centric low-cost hardware resiliency solutions. A recent tool, called Relyzer, systematically analyzes …

Cited by 28 publications (10 citation statements)
References 24 publications
“…Figure 4.4B plots the estimated SDC rate by the regression model against the actual SDC rate for 5 benchmark programs with 10 different compiler optimizations. As can be seen from the figure, the estimated SDC rate by the model has very poor correlation with the actual SDC rate (a similar result was obtained by Hari et al [12]). Therefore, it is nontrivial to determine the error resilience of the program using these factors, and hence we use a fault injection experiment for fitness function and score calculation in our approaches.…”
supporting
confidence: 82%
“…In [30] a fault injection tool based on the cycle accurate full system simulator Gem5 is proposed. Authors of [14] propose a technique to reduce the fault simulation time through grouping error simulations that produce same intermediate execution state. In [25], a statistical method to estimate the outcome of a system in presence of soft errors is proposed.…”
Section: Results
mentioning
confidence: 99%
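The grouping idea mentioned in the statement above can be illustrated with a minimal sketch. It assumes each error simulation can be summarized by a hashable snapshot of its intermediate execution state; the function name `group_by_state` and the sample data are hypothetical and do not come from the cited paper.

```python
from collections import defaultdict


def group_by_state(simulations):
    """Map each intermediate-state snapshot to the error simulations that reached it.

    `simulations` is an iterable of (simulation_id, state) pairs, where `state`
    is any hashable value standing in for the intermediate execution state.
    """
    groups = defaultdict(list)
    for sim_id, state in simulations:
        groups[state].append(sim_id)
    return dict(groups)


# Hypothetical data: three injected errors, two of which converge to the same state.
sims = [("e1", (3, 7)), ("e2", (3, 7)), ("e3", (4, 7))]
groups = group_by_state(sims)

# Under this assumption, one representative per group is simulated to completion
# and its outcome is shared with the rest of the group.
representatives = {state: ids[0] for state, ids in groups.items()}
```

Here three simulations collapse into two groups, so only two of them would need to run to the end.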
“…The intrinsic limitations of fault-injection on COTS make it impossible to correlate each physical radiation-induced error with its manifestation at the output. However, by corrupting variable values, we are able to identify those parts of the code that are likely to affect HOG execution [Hari et al 2014]. In other words, we can calculate the SDC rate for HOG, which is the percentage of injected errors that caused SDCs.…”
Section: Experimental Methodology
mentioning
confidence: 99%
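The SDC rate defined in the statement above, the percentage of injected errors that caused SDCs, can be computed as in the following sketch. The outcome labels (`"masked"`, `"detected"`, `"sdc"`) and the sample campaign are assumptions made for illustration, not data from any cited work.

```python
from collections import Counter


def sdc_rate(outcomes):
    """Return the percentage of injected errors whose outcome is labeled "sdc"."""
    if not outcomes:
        raise ValueError("at least one injection outcome is required")
    counts = Counter(outcomes)
    return 100.0 * counts["sdc"] / len(outcomes)


# Hypothetical fault-injection campaign with eight injected errors.
outcomes = ["masked", "sdc", "detected", "masked", "sdc", "masked", "masked", "masked"]
rate = sdc_rate(outcomes)  # 2 SDCs out of 8 injections
```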