2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA) 2015
DOI: 10.1109/hpca.2015.7056053

Balancing reliability, cost, and performance tradeoffs with FreeFault

Abstract: Memory errors have been a major source of system failures, and fault rates may rise even further as memory continues to scale. This increasing fault rate, especially when combined with the advent of integrated on-package memories, may exceed the capabilities of traditional fault tolerance mechanisms or significantly increase their overhead. In this paper, we present FreeFault as a hardware-only, transparent, and nearly-free resilience mechanism that is implemented entirely within a processor and can tolerat…

Cited by 10 publications (4 citation statements)
References 36 publications (35 reference statements)
“…Several solutions are available for preventing predicted UE from occurring. For example, cache lines were controlled to remove errors by isolating 8 KB capacity or less at the expense of system overhead [85][86][87][88]. However, page offline can remove UEs more than 94% by isolating 4 KB capacity without additional hardware expense [27,89].…”
Section: Error Prediction | Type: mentioning
confidence: 99%
“…FreeFault, RelaxFault: FreeFault [23] remaps a faulty DRAM region to a line in Last-Level Cache (LLC) during operation. After the remapping, it pins the cache line and does not access the faulty DRAM region.…”
Section: Prior Work | Type: mentioning
confidence: 99%
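
The remap-and-pin behavior described in the statement above can be illustrated with a toy model. This is only a sketch under stated assumptions, not FreeFault's actual hardware design: the `ToyLLC` class, its fully associative LRU organization, the 64-byte line size, and the dictionary standing in for DRAM are all hypothetical choices made for illustration.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed value, not from the source)


class ToyLLC:
    """Hypothetical fully associative cache with LRU replacement and pinning."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()  # line address -> data, oldest first
        self.pinned = set()         # line addresses that must never be evicted
        self.dram_reads = []        # log of line addresses fetched from DRAM

    def _evict(self):
        # Evict the least recently used line that is not pinned.
        for addr in self.lines:
            if addr not in self.pinned:
                del self.lines[addr]
                return
        raise RuntimeError("all lines are pinned; nothing can be evicted")

    def pin_faulty_line(self, line_addr, data):
        """Keep the data of a faulty DRAM line in the cache and lock it there."""
        if line_addr not in self.lines and len(self.lines) >= self.capacity:
            self._evict()
        self.lines[line_addr] = data
        self.pinned.add(line_addr)

    def read(self, byte_addr, dram):
        line_addr = byte_addr // LINE_SIZE
        if line_addr in self.lines:
            # Hit: pinned lines always land here, so faulty DRAM is never read.
            self.lines.move_to_end(line_addr)
            return self.lines[line_addr]
        # Miss: fetch from DRAM, evicting an unpinned line if the cache is full.
        self.dram_reads.append(line_addr)
        if len(self.lines) >= self.capacity:
            self._evict()
        self.lines[line_addr] = dram.get(line_addr, 0)
        return self.lines[line_addr]
```

In this model, once a line is pinned, later reads to that address hit in the cache regardless of how much other traffic passes through, so the faulty DRAM location is never accessed again. The cost is that each pinned line reduces the capacity available to ordinary data.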
“…Cache based remapping ECP [41] Archshield [5] FLOWER/Fame [27] CiDRA [28] FreeFault [23] RelaxFault [24] DEFCAM [42] DRC (Ours) DEFCAM: DEFCAM [42] proposed a dynamic remapping mechanism for caches. It redirects access to a faulty cache set to one of the health sets.…”
Section: Non-cache Based Remapping | Type: mentioning
confidence: 99%