Abstract: While Phase Change Memory (PCM) has emerged as one of the most promising complements to, or even replacements for, DRAM-based memory, it has only limited write endurance. Because writes are unevenly distributed, PCM is highly likely to suffer early failures, which can spread across the chip and leave the entire chip unusable. Wear leveling is an indispensable technique for evening out the wear caused by writes. However, because of process variation, early failures cannot be fully avoided. State-of-the-art wear-leveling schemes, such as Start-Gap and Security Refresh, cease to function once even a single block fails, because their designs require a persistently writable address space for wear-leveling operations. Existing solutions to this problem demand substantial OS support, such as explicit space allocation and data migration, and this demand for OS cooperation creates a barrier to widespread adoption of PCM.

Fault-tolerance techniques such as FREE-p and Zombie, which remap failed blocks to inaccessible but healthy space, have the potential to address the wear-leveling issue by relocating data from failed blocks to healthy ones. However, they cannot work together with wear-leveling schemes, because data migration may change the placement of relocated data. In this paper, we propose a framework, WL-Reviver, that allows any in-PCM wear-leveling scheme to keep delivering its designed leveling service even after failures occur in its working address space. The design is unique in two respects: (1) it leverages fault-tolerance techniques so that they can work together with wear-leveling schemes; and (2) it requires no OS support beyond what is already available to today's DRAM-based memory systems. Furthermore, WL-Reviver is a lightweight framework with very low overhead. Our extensive experiments show that WL-Reviver can efficiently revive a wear-leveling scheme without compromising the scheme's wear-leveling effect.
While Phase Change Memory (PCM) holds great promise as a complement to, or even a replacement for, DRAM-based memory and flash-based storage, it must overcome its limited write endurance to be a reliable device over an extended period of intensive use. Limited write endurance can lead to permanent stuck-at faults after a certain number of writes, leaving some memory cells permanently stuck at either '0' or '1'. State-of-the-art solutions partition a data block into bit groups and apply a bit-inversion technique to selected groups. The effectiveness of this approach hinges on how the data block is partitioned. While all existing solutions can separate faults into different groups for error correction, they fall short on three fundamental capabilities desired of any partition scheme. First, the scheme should maximize the probability of successfully re-partitioning a block so that two faults currently in the same group are placed into two different groups. Second, it should partition a block into a small number of groups for space efficiency. Third, it should spread faults across the groups as uniformly as possible, so that more faults can be accommodated within the same number of groups. A recovery solution with these capabilities can provide strong fault tolerance with minimal overhead.

We propose Aegis, a recovery solution with a systematic partition scheme that uses fewer groups to accommodate more faults than state-of-the-art schemes. The uniqueness of Aegis's partition scheme lies in its guarantee that any two bits in the same group will not be in the same group after a re-partition. Empowered by this partition scheme, Aegis can recover significantly more faults with reduced space overhead relative to state-of-the-art solutions.
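The interplay between partitioning and group-wise bit inversion can be illustrated with a small sketch. It is not taken from the abstract: it assumes one possible partition rule, in which bits are laid out row by row on a grid with a prime number of columns and each "slope" defines a different partition into groups. All names (`group_of`, `find_slope`, `write_block`, `read_block`, `COLS`) are hypothetical, and the block size is kept at 32 bits for readability. Under this assumed rule, any two bits share a group under at most one slope, which matches the re-partition guarantee described above.

```python
from itertools import combinations

COLS = 7      # assumed prime column count; rows = ceil(32 / 7) = 5 <= COLS
N_BITS = 32   # toy block size, chosen only for illustration


def group_of(bit, slope, cols=COLS):
    """Group of `bit` under the partition identified by `slope` (assumed rule)."""
    row, col = divmod(bit, cols)
    return (col - row * slope) % cols


def shared_slopes(a, b, cols=COLS):
    """Number of slopes under which bits a and b land in the same group."""
    return sum(group_of(a, k, cols) == group_of(b, k, cols) for k in range(cols))


def find_slope(faults, cols=COLS):
    """First slope that puts every faulty bit in its own group, or None."""
    for slope in range(cols):
        groups = [group_of(f, slope, cols) for f in faults]
        if len(set(groups)) == len(groups):
            return slope
    return None


def write_block(data, stuck, slope, cols=COLS):
    """Write `data`, inverting each group whose stuck cell disagrees with it.

    `stuck` maps a faulty bit index to its stuck-at value. The slope must
    come from find_slope, so that each group holds at most one fault.
    """
    flags = [0] * cols  # one inversion flag per group
    for bit, value in stuck.items():
        if data[bit] != value:
            flags[group_of(bit, slope, cols)] = 1
    cells = [d ^ flags[group_of(i, slope, cols)] for i, d in enumerate(data)]
    for bit, value in stuck.items():
        cells[bit] = value  # a stuck cell keeps its value regardless of the write
    return cells, flags


def read_block(cells, flags, slope, cols=COLS):
    """Undo the per-group inversion to recover the original data."""
    return [c ^ flags[group_of(i, slope, cols)] for i, c in enumerate(cells)]


# Bits 0 and 7 collide under slope 0, so a re-partition is needed.
faults = {0: 0, 7: 1, 20: 1}
slope = find_slope(faults)
data = [1, 0] * (N_BITS // 2)
cells, flags = write_block(data, faults, slope)
recovered = read_block(cells, flags, slope)
max_shared = max(shared_slopes(a, b) for a, b in combinations(range(N_BITS), 2))
```

In this toy run the two faults that share a group under slope 0 are separated by moving to another slope, and the data reads back intact even though three cells are stuck. The number of inversion flags equals the number of groups, which is why a scheme that needs fewer groups also needs less metadata.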