Highly reliable solid-state drives (SSDs) with triple-level-cell (TLC) NAND flash and Advanced Error-Prediction Low-Density Parity-Check (AEP-LDPC) are proposed. To increase NAND flash capacity, the number of bits per cell has been doubled and tripled, which drastically degrades reliability because the V_TH margins become narrower. The previously proposed Error-Prediction LDPC (EP-LDPC) error-correcting code (ECC) improved reliability for multi-level-cell (MLC) NAND flash [4]. However, EP-LDPC does not model program disturb, so its precision is limited, especially for short data-retention times (< 2 days). Here, AEP-LDPC is proposed for TLC NAND flash. By considering the effects of program disturb, data retention and floating-gate capacitive coupling, error prediction becomes more accurate, and highly reliable SSDs with high-speed read capability can be realized. The SSD's data-retention time increases by more than 12×, decode iterations decrease by 57%, and the acceptable TLC NAND BER increases by more than 2.8×.

Index Terms: Error correcting code (ECC), low density parity check code (LDPC), triple level cell (TLC) NAND flash, memory controller, reliability, solid state drive (SSD)
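As a rough illustration of how an error-prediction scheme can turn a hard-decision read into a soft input for the LDPC decoder, the sketch below looks up an estimated bit-error probability p for the cell's operating condition and converts it to a log-likelihood ratio with the standard binary-symmetric-channel relation ln((1 - p) / p). The table keys, bucket names and probability values are hypothetical placeholders, not data from the paper, and the real AEP-LDPC tables (which also account for program disturb) are not reproduced here.

```python
import math

# Hypothetical bit-error probabilities, keyed by
# (W/E-cycle bucket, retention bucket, neighbor-cell state).
# The values are illustrative placeholders, not measured data.
ERROR_PROB = {
    ("low_we", "short", "erased"): 1e-4,
    ("low_we", "short", "programmed"): 5e-4,
    ("high_we", "long", "erased"): 2e-3,
    ("high_we", "long", "programmed"): 1e-2,
}


def estimate_llr(hard_bit, we_bucket, retention_bucket, neighbor_state):
    """Return an LLR for one hard-decision bit (positive favors bit 0)."""
    p = ERROR_PROB[(we_bucket, retention_bucket, neighbor_state)]
    # For a binary symmetric channel with crossover probability p,
    # the LLR magnitude is ln((1 - p) / p).
    magnitude = math.log((1.0 - p) / p)
    return magnitude if hard_bit == 0 else -magnitude
```

In this sketch a worn cell with long retention and a programmed neighbor gets a smaller LLR magnitude than a fresh cell, so the decoder trusts that bit less.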
A hybrid storage architecture of ReRAM and TLC (3b/cell) NAND flash with RAID-5/6 is developed to meet cloud data-center requirements for reliability, speed and capacity. The storage controller enhances reliability and performance through five techniques with minimal area overhead. The first three techniques, (i) flexible R_Ref (FR), (ii) adaptive asymmetric coding (AAC), and (iii) verify trials reduction (VTR), are applied to 50nm ReRAM to improve the bit-error rate (BER) by 69% and performance by 97%. Techniques (iv) balanced RAID-5/6 and (v) bits/cell optimization (BCO) are applied to 2Xnm TLC NAND to reduce the failure rate by 98% and extend the lifetime (write/erase (W/E) cycles) by >22×, respectively.

Conventionally, in hybrid ReRAM/MLC (2b/cell) NAND storage, high-speed ReRAM is paired in small ratios with high-capacity NAND, and data is allocated based on access frequency (i.e., frequently written hot data to ReRAM, cold data to NAND) and data size (i.e., fragmented random data to ReRAM, sequential data to NAND) to enhance the overall system performance, reliability and power [1]. Exchangeable TLC/MLC NAND storage arrays have been proposed [2], as has the application of duplicate data requiring RAID-1 to MLC NAND, for reliability improvement in enterprise servers [3].

This work presents hybrid storage of ReRAM and TLC NAND flash with cost-effective RAID-5/6 (Fig. 19.6.1). RAID-5/6 is widely used in cloud storage and data warehouses because of its lower parity overhead (<10%) compared to RAID-1. Our architecture provides significant improvements in the reliability and performance of ReRAM and TLC NAND. Data destined for ReRAM is encoded by AAC and then written to ReRAM with VTR. During read, FR determines the optimum read reference resistance, R_Ref, to minimize the BER. In the NAND, balanced RAID-5/6 evenly allocates the data among the different page types in order to minimize the worst-case RAID failure rate, and BCO decides the mode (TLC/MLC/SLC) to extend the NAND chip's lifetime.
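The idea of spreading a RAID stripe evenly over the TLC page types can be sketched as follows. This is only one plausible allocation (a simple rotation), assuming the three page types are called lower, middle and upper and that each chip contributes one page per stripe; the function names and the rotation rule are assumptions for illustration, not the controller's actual algorithm.

```python
from collections import Counter

# The three page types of a TLC wordline (naming assumed for illustration).
PAGE_TYPES = ("lower", "middle", "upper")


def balanced_stripe(stripe_index, chips):
    """Assign one page type per chip, rotating the starting type per stripe."""
    return [PAGE_TYPES[(stripe_index + chip) % len(PAGE_TYPES)]
            for chip in range(chips)]


def unbalanced_stripe(stripe_index, chips):
    """Baseline: every chip in the stripe uses the same page type."""
    return [PAGE_TYPES[stripe_index % len(PAGE_TYPES)]] * chips


def page_type_mix(stripe):
    """Count how many pages of each type a stripe contains."""
    return Counter(stripe)
```

With six chips, `page_type_mix(balanced_stripe(0, 6))` contains two pages of each type, whereas the baseline stripe consists of a single page type, so a stripe built entirely from the least reliable page type is avoided.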
Figure 19.6.2 describes FR in ReRAM, based on a 50nm, 64Mb Al_xO_y prototype, in which verify programming is applied to write units of 1Kb. After each set/reset pulse, a verify read checks that the resistances satisfy the threshold levels (
An enterprise-grade SSD with TLC (3b/cell) NAND flash is presented with three techniques that achieve high speed and high reliability. Quick low-density parity-check (LDPC) reduces the read latency of a 1Xnm TLC NAND flash SSD by 83%. Dynamic V_TH optimization and auto data recovery reduce the NAND flash bit-error rate (BER) by 80% and 18%, respectively. These techniques can be implemented in the SSD controller without circuit overhead and require no modification to the TLC NAND flash.

Enterprise storage for big-data applications demands high speed, high reliability and high density. Although TLC NAND flash SSDs have a bit-cost advantage over MLC (2b/cell) NAND flash SSDs, their adoption in the enterprise market is limited by poor speed and reliability. Real-time online analytical processing (OLAP) applications require a quick response from the SSD. In real-world workloads, temporal data locality causes read requests to concentrate on the same memory cells. Frequently read data (hot data) suffers from read disturb, while cold data fails because of data-retention errors. To overcome the performance and reliability problems of TLC NAND flash SSDs, this paper describes the three techniques shown in Fig. 7.7.1.

To efficiently correct errors with a short latency, this paper presents quick LDPC. Its read latency is 83% lower than that of the advanced error-prediction LDPC (AEP-LDPC) error-correcting code (ECC) [1]. When memory cells wear out and errors exceed the ECC capability, dynamic V_TH optimization adaptively selects the optimal read reference voltage (V_Ref) and increases the V_TH read margin. As a result, measured errors are reduced by 80%. Auto data recovery compensates for the V_TH decrease (the data-retention error) with the V_TH increase (the read-disturb error).

First, the LDPC is shown in Fig. 7.7.2. Figure 7.7.3 shows the total read latency and the measured reliability. In the enterprise MLC NAND flash SSD, fast BCH ECC [2] is used.
The error-correction capability of BCH is not sufficient for enterprise use of TLC NAND flash because enterprise storage requires higher endurance than consumer storage. The soft-decoding LDPC [3] corrects more errors than BCH ECC. However, it needs the analog V_TH to calculate the log-likelihood ratio (LLR), which requires 49 V_Ref sensing operations and increases the read latency to 2.3ms. V_Ref sensing is defined as sensing a memory cell with one of the reference levels. The conventional AEP-LDPC estimates the LLR from the hard-decision (digital) V_TH, the write/erase (W/E) cycle count, the retention time and inter-cell coupling information. Since the analog V_TH is not used, the read latency decreases to 1ms. Yet, AEP-LDPC is still 7× slower than BCH ECC. In AEP-LDPC, 21 V_Ref sensing operations are required to read neighboring cell data in both the wordline and bitline directions.

To accelerate the read while securing high reliability, the quick LDPC reads only one of the upper/middle/lower pages, corresponding to 2 to 3 V_Ref sensing operations. The total read latency is 173μs, which is comparable with the 146μs latency of BCH ECC. Th...
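The latency figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text (146μs, 173μs, 1ms and 2.3ms); the dictionary and helper names are introduced here purely for illustration.

```python
# Total read latencies in microseconds, as quoted in the text.
LATENCY_US = {
    "BCH": 146,
    "quick LDPC": 173,
    "AEP-LDPC": 1000,
    "soft-decoding LDPC": 2300,
}


def reduction_percent(new_us, old_us):
    """Percentage by which new_us is lower than old_us."""
    return 100.0 * (1.0 - new_us / old_us)


def slowdown(slow_us, fast_us):
    """How many times slower slow_us is than fast_us."""
    return slow_us / fast_us


quick_vs_aep = reduction_percent(LATENCY_US["quick LDPC"], LATENCY_US["AEP-LDPC"])
aep_vs_bch = slowdown(LATENCY_US["AEP-LDPC"], LATENCY_US["BCH"])
```

Here `quick_vs_aep` evaluates to about 82.7, consistent with the stated 83% reduction, and `aep_vs_bch` to about 6.8, consistent with AEP-LDPC being roughly 7× slower than BCH ECC.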