Continued scaling of NAND flash memory to smaller process technology nodes decreases its reliability, necessitating more sophisticated mechanisms to correctly read stored data values. To distinguish between different potential stored values, conventional techniques to read data from flash memory employ a single set of reference voltage values, which are determined based on the overall threshold voltage distribution of flash cells. Unfortunately, the phenomenon of program interference, in which a cell's threshold voltage unintentionally changes when a neighboring cell is programmed, makes this conventional approach increasingly inaccurate in determining the values of cells. This paper makes the new empirical observation that identifying the value stored in the immediate-neighbor cell makes it easier to determine the data value stored in the cell that is being read. We provide a detailed statistical and experimental characterization of the threshold voltage distribution of flash memory cells conditional upon the immediate-neighbor cell values, and show that such conditional distributions can be used to determine a set of read reference voltages that lead to error rates much lower than when a single set of reference voltage values based on the overall distribution is used. Based on our analyses, we propose a new method for correcting errors in a flash memory page, neighbor-cell assisted correction (NAC). The key idea is to re-read a flash memory page that fails error correction code (ECC) decoding with the set of read reference voltage values corresponding to the conditional threshold voltage distribution assuming a neighbor cell value, and to use the re-read values to correct the cells that have neighbors with that value. Our simulations show that NAC effectively improves flash memory lifetime by 33% while having no (at nominal lifetime) or very modest (less than 5% at extended lifetime) performance overhead.
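The re-read step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the single-reference (two-state) cell model, the voltage numbers, and the `ecc_ok` stand-in (which simply compares against known data instead of running a real ECC decoder) are all assumptions made for illustration, and the neighbor values are assumed to have been read already.

```python
from typing import Callable, Dict, List, Sequence

def read_with_refs(vth: Sequence[float], refs: Sequence[float]) -> List[int]:
    """Map each threshold voltage to a state index:
    the number of reference voltages at or below it."""
    return [sum(v >= r for r in refs) for v in vth]

def nac_read(
    vth: Sequence[float],
    neighbor_values: Sequence[int],
    default_refs: Sequence[float],
    conditional_refs: Dict[int, Sequence[float]],
    ecc_ok: Callable[[List[int]], bool],
) -> List[int]:
    """Read a page; if ECC fails, re-read with reference voltages
    conditioned on each neighbor value and patch the matching cells."""
    data = read_with_refs(vth, default_refs)
    if ecc_ok(data):
        return data  # no extra reads when the normal read succeeds
    for neighbor_value, refs in conditional_refs.items():
        reread = read_with_refs(vth, refs)
        # Only cells whose neighbor holds this value take the re-read result.
        for i, n in enumerate(neighbor_values):
            if n == neighbor_value:
                data[i] = reread[i]
        if ecc_ok(data):
            break
    return data

# Hypothetical example: cell 2 stores 0, but its programmed neighbor
# pushed its threshold voltage above the default reference.
stored = [0, 1, 0, 1]
vth = [0.3, 1.05, 1.1, 1.6]
neighbors = [0, 0, 1, 1]
default_refs = [1.0]
conditional_refs = {0: [0.95], 1: [1.25]}  # assumed per-neighbor references

def ecc_ok(data: List[int]) -> bool:
    # Stand-in for a real ECC decode: succeeds only on error-free data.
    return data == stored

first_read = read_with_refs(vth, default_refs)
corrected = nac_read(vth, neighbors, default_refs, conditional_refs, ecc_ok)
```

In this toy setup the default read misjudges cell 2, and the re-read conditioned on neighbor value 1 repairs it. The extra reads happen only on an ECC failure, which is consistent with the abstract's claim of no overhead at nominal lifetime.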
A register file is a critical component of a modern superscalar processor. It has a large number of entries and read/write ports in order to enable high levels of instruction parallelism. As a result, the register file's area, access time, and energy consumption increase dramatically, significantly affecting the superscalar processor's overall performance and energy consumption. This is especially true in 64-bit processors. This paper presents a new integer register file organization, which reduces energy consumption, area, and access time of the register file with a minimal effect on overall IPC. This is accomplished by exploiting a new concept, partial value locality, which is defined as the occurrence of multiple live value instances identical in a subset of their bits. A possible implementation of the new register file is described and shown to obtain proposed optimized register file designs. Overall, an energy reduction of over 50%, an 18% decrease in area, and a 15% reduction in the access time are achieved in the new register file. The energy and area savings are achieved with a 1.7% reduction in IPC for integer applications and a negligible 0.3% in numerical applications, assuming the same clock frequency. A performance increase of up to 13% is possible if the clock frequency can be increased due to a reduction in the register file access time. This approach enables other, very promising optimizations, three of which are outlined in the paper.
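To make the definition of partial value locality concrete, the following Python sketch groups a set of live 64-bit values by their upper bits. The 16-bit split point, the function names, and the sample values are assumptions chosen for illustration; the abstract does not specify which bit subset the proposed register file uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def split_value(value: int, low_bits: int = 16, width: int = 64) -> Tuple[int, int]:
    """Split a value into (upper part, lower part) at an assumed bit boundary."""
    value &= (1 << width) - 1          # keep only `width` bits
    low_mask = (1 << low_bits) - 1
    return value >> low_bits, value & low_mask

def group_by_upper_bits(live_values: Iterable[int], low_bits: int = 16) -> Dict[int, List[int]]:
    """Group live values that are identical in their upper bits.
    Each group needs its shared upper part stored only once."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for v in live_values:
        high, low = split_value(v, low_bits)
        groups[high].append(low)
    return dict(groups)

# Hypothetical live values: three nearby addresses and two small integers.
live = [
    0x00007FFF12340010,
    0x00007FFF12340020,
    0x00007FFF1234FFF8,
    0x5,
    0x2A,
]
groups = group_by_upper_bits(live)
```

Here five live values share only two distinct upper parts, so a structure that stores each upper part once would hold two wide entries instead of five. This is the kind of redundancy the abstract refers to.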
Address translation is fundamental to processor performance. Prior work focused on reducing Translation Lookaside Buffer (TLB) misses to improve performance and energy, whereas we show that even TLB hits consume a significant amount of dynamic energy. To reduce the energy cost of address translation, we first propose Lite, a mechanism that monitors the performance and utility of L1 TLBs, and adaptively changes their sizes with way-disabling. The resulting TLB Lite organization opportunistically reduces the dynamic energy spent in address translation by 23% on average with minimal impact on TLB miss cycles. To further reduce the energy and performance overheads of L1 TLBs, we also propose RMM Lite that targets the recently proposed Redundant Memory Mappings (RMM) address-translation mechanism. RMM maps most of a process's address space with arbitrarily large ranges of contiguous pages in both virtual and physical address space using a modest number of entries in a range TLB. RMM Lite adds to RMM an L1-range TLB and the Lite mechanism. The high hit ratio of the L1-range TLB allows Lite to downsize the L1-page TLBs more aggressively. RMM Lite reduces the dynamic energy spent in address translation by 71% on average. Above the near-zero L2 TLB misses from RMM, RMM Lite further reduces the overhead from L1 TLB misses by 99%. These proposed designs target current and future energy-efficient memory system design to meet the ever-increasing memory demands of applications.
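One way to picture utility-driven way-disabling is the per-interval decision sketched below in Python. This is a hypothetical policy, not the Lite algorithm itself: the abstract does not give the monitoring details, so the LRU-position hit counters, the halving and doubling steps, and both thresholds are assumptions.

```python
from typing import Sequence

def next_active_ways(
    active: int,
    hits_by_lru_pos: Sequence[int],
    accesses: int,
    max_ways: int = 4,
    loss_threshold: float = 0.01,   # assumed: tolerated extra miss fraction
    grow_threshold: float = 0.05,   # assumed: miss fraction that triggers growth
) -> int:
    """Choose the number of enabled TLB ways for the next interval.

    hits_by_lru_pos[i] counts hits at LRU stack position i (0 = most
    recently used) among the currently active ways during the interval.
    """
    if accesses == 0:
        return active
    # Hits that would have become misses with half as many ways.
    lost_if_halved = sum(hits_by_lru_pos[active // 2:active])
    if active > 1 and lost_if_halved / accesses <= loss_threshold:
        return active // 2          # low utility: disable half the ways
    miss_rate = (accesses - sum(hits_by_lru_pos[:active])) / accesses
    if active < max_ways and miss_rate > grow_threshold:
        return min(max_ways, active * 2)  # too many misses: re-enable ways
    return active

shrink = next_active_ways(4, [900, 90, 5, 2], 1000)
hold = next_active_ways(2, [900, 90], 1000)
grow = next_active_ways(2, [600, 200], 1000)
```

In the first call only 7 of 1000 accesses hit in the two least recently used ways, so the sketch halves the TLB to save dynamic energy. In the third call the miss rate is high enough that ways are re-enabled.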