2019 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT) 2019
DOI: 10.1109/dft.2019.8875269

A Comprehensive Evaluation of the Effects of Input Data on the Resilience of GPU Applications

Cited by 10 publications
(4 citation statements)
References 23 publications
“…In detailed terms, we employed three types of tiles: random (R) tiles with a random distribution of values, zero (Z) tiles with numbers close to zero, and triangular (T) tiles with a triangular distribution of values. This selection is also in concordance with other works in the field that have argued that fault propagation in the GPU's data path is data-independent if the input data (tiles) are not biased (i.e., composed of an excessive amount of 0s or all 1s) [59].…”
Section: Fault Evaluation and Error Propagation (supporting)
confidence: 88%
“…Oliveira et al [140] provide an extensive description of software mitigation techniques for non-safety-related uses of GPUs. In addition to this, there is a reported correlation between the measured reliability and the executed applications [63,78,159], and even the explored application input sizes and biased input values [158,159] (e.g., 30% failure rate increase with input size change [159]).…”
Section: Software-only Techniques (mentioning)
confidence: 89%
“…
- Detection and fault-tolerance algorithmic approaches [35,138,169,170,168,195,140,49,50,110,139,156,102,202,80,204,58,76]
- Fault-tolerance based on intrinsic application and/or input data characteristics [63,78,159,158] §4…”
Section: Random Hw Failures (mentioning)
confidence: 99%
“…Other works [8], [19] evaluated the reliability of GPUs by performing extensive fault campaigns at two abstraction levels (application and architectural). The application analysis provides some details about the vulnerability of a GPU without direct correlation with the fault effect caused in internal modules.…”
Section: A Methodologies For Reliability Evaluation (mentioning)
confidence: 99%