Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units 2009
DOI: 10.1145/1513895.1513907

Understanding software approaches for GPGPU reliability

Abstract: Even though graphics processors (GPUs) are becoming increasingly popular for general purpose computing, current (and likely near future) generations of GPUs do not provide hardware support for detecting soft/hard errors in computation logic or memory storage cells since graphics applications are inherently fault tolerant. As a result, if an error occurs in GPUs during program execution, the results could be silently corrupted, which is not acceptable for general purpose computations. To improve the fidelity of…

Cited by 91 publications (51 citation statements)
References 14 publications
“…Dimitrov et al [24] proposed three approaches for GPGPU reliability that leverage both instruction-level parallelism and thread-level parallelism to replicate the application code. Their approach incurs performance overheads of 85 to 100%, and they conclude that understanding both the application characteristics and the hardware platform is necessary for efficient protection.…”
Section: Related Work (mentioning)
confidence: 99%
“…Sheaffer et al [5] explore the concept of the sphere of replication on GPGPU processors, and present a hardware redundancy-based approach to create a reliable GPU with no performance loss. Dimitrov [8] investigate three software approaches to perform redundant execution for GPGPU reliability. Checkpointing is a widely used protection mechanism in CPU processors, it has been applied to enhance GPGPU robustness as well.…”
Section: Related Work (mentioning)
confidence: 99%
“…Soft error rate (SER) has been predicted to increase exponentially [6,7]. GPGPUs with hundreds of cores integrated into a single chip are prone to manifest high SER [8]. For examples, eight soft errors were observed in a 72-hour run of testing program on 60 NVIDIA GeForce 8800GTS 512 [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…In this case, the designer does not need to consider any fault-tolerance technique for GPU. But in the field of general-purpose computing, especially scientific computing, the reliability of the applications must be guaranteed [10].…”
Section: Fault Tolerance (mentioning)
confidence: 99%