2014 Symposium on VLSI Circuits Digest of Technical Papers 2014
DOI: 10.1109/vlsic.2014.6858414

Early detection and repair of VRT and aging DRAM bits by margined in-field BIST

Abstract: We propose improving system availability by performing in-field repair at the chip level. This enables margining and detection of degrading memory cells before the user observes any errors. A 576 Mb embedded DRAM at 1.5 GHz in a 40 nm CMOS technology achieves improved resilience to both aging memory cells and cells with variable retention time (VRT). Uninterrupted user access of 6 billion 72-bit read and write operations per second is maintained during background repair.

Introduction: The common approach to memo…

Citations: cited by 3 publications (3 citation statements)
References: 7 publications (5 reference statements)
“…Nonetheless, a particular event may start an OS service which will affect DRAM reliability. Beside this, DRAM reliability can be affected by alpha-particles [30] and degrade over time due to aging [31]. Finally, the integrity of data in DRAM can be compromised by row hammer attacks [12], especially when we relax…”
Section: Incorrect Predictions and Side Effects (mentioning)
confidence: 99%
“…While some techniques involve fault-specific adaptation of the ECC code [3], a more general approach is to retire faulty components and potentially replace them. Such retire/replace techniques generally fall into one of six main categories (from least to most intrusive in terms of hardware design): (1) retire entire nodes in a large system (software); (2) retire memory ranks or channels (hardware); (3) retire memory frames at OS page granularity (software) [18,25,59]; (4) retire individual chips in a rank, which necessitates a reduction in ECC coverage of additional faults (hardware) [18,25]; (5) compensate for reduced ECC by increasing access granularity with memory-channel coupling (hardware) [25]; and (6) fine-grained retirement with remapping to redundant storage (hardware) [45,58,18,25,38].…”
Section: Related Work (mentioning)
confidence: 99%
“…An economical alternative is to retire and possibly remap just the faulty memory regions. There are three broad categories for retirement techniques (we refine this classification and provide more detail in Section VI): (1) retire memory and reduce available memory capacity (e.g., node-level, channel-level, or OS frame level), which is costly in lost resources and difficult and complicated to implement in some systems [12]; (2) retire a faulty chip in a memory module and either change protection level or compensate protection by coupling memory channels, which then impacts performance and power [9,18,25,28]; and (3) remap faulty addresses to alternative memory locations, which currently requires redundant memory and adds latency and complexity for remapping hardware [58,45,44,38].…”
Section: Introduction (mentioning)
confidence: 99%