2014 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT) 2014
DOI: 10.1109/dft.2014.6962069

Characterization of data retention faults in DRAM devices

Abstract: Dynamic random access memory (DRAM) is the most widely used type of memory in the consumer market today, and it is still widely used in mass memories for space applications. Even though vendors perform accurate tests to ensure high reliability, DRAM errors continue to be a common source of failures in the field. Recent large-scale studies reported that most of the errors experienced by DRAM subsystems are due to faults that repeat on the same memory address but occur only under specific conditions. As …

Cited by 17 publications
(8 citation statements)
References 16 publications
“…In turn, choosing an effective test methodology requires knowledge of basic properties about a DRAM chip's design and/or error mechanisms¹. For example, DRAM manufacturer's design choices for the sizes of internal storage arrays (i.e., mats [36,69,140,264]), charge encoding conventions of each cell (i.e., the true- and anti-cell organization [98,189]), use of on-die reliability-improving mechanisms (e.g., on-die ECC, TRR), and organization of row and column addresses all play key roles in determining if and how susceptible a DRAM chip is to key error mechanisms (e.g., data retention [95,98,189,191,265-267], access-latency-related failures [37, 39,…”
Section: Information Flow During Testing
mentioning
confidence: 99%
“…As process scaling continues, intermittent faults are of growing concern. Several field studies provide their characteristics [5], [7], [8], [30]: First, intermittent faults have been outgrowing other faults and will be the dominant challenge in memory reliability. Second, it is difficult to screen out all intermittent faults with testing.…”
Section: DRAM Faults and Errors
mentioning
confidence: 99%
“…The importance of DRAM reliability is further growing as mission-critical systems (e.g., autonomous driving) require higher reliability and systems deploy more DRAM chips. Despite the demand, individual DRAM chips are becoming more vulnerable to faults and errors [5]-[8]. Process scaling has shrunk DRAM cells into nm-scale feature size and fF-scale cell capacitance.…”
Section: Introduction
mentioning
confidence: 99%
“…We make three observations. First, HARP-U does not identify any bits at risk of indirect errors¹¹ because it bypasses the on-die ECC correction process that causes indirect errors. In contrast, HARP-A quickly identifies a subset of all bits at risk of indirect errors by predicting them from the identified direct errors.…”
mentioning
confidence: 99%
“…9 shows the worst-case (i.e., maximum) number of post-correction errors that can occur simultaneously within an ECC word after active profiling. [Footnote 11: Except for a small number of direct and indirect errors that overlap.] This number is the correction capability required from secondary ECC to safely perform reactive profiling.…”
mentioning
confidence: 99%