A case for exploiting subarray-level parallelism (SALP) in DRAM

Kim, Ki Wook; Seshadri, Krishna G; Lee, Myeong Soo; Liu, Ming; Mutlu,

doi:10.1109/isca.2012.6237032

Cited by 195 publications

(111 citation statements)

References 42 publications

Supporting

Mentioning

110

Contrasting


“…This allows the row's data (in the form of charge) to be transferred into the row-buffer shown in Figure 1a. Better known as sense-amplifiers, the row-buffer reads out the charge from the cells, a process that destroys the data in [38,41,43]. Subsequently, all accesses to the row are served by the row-buffer on behalf of the row.…”

Section: Low-level Organization (mentioning)

confidence: 99%
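
The row-buffer behavior described in the statement above can be sketched as a toy model. This is a minimal illustration, not the cited paper's code: the `Bank` class and its method names are hypothetical, and writing the buffer back to the cells only on precharge is a simplifying assumption about restore timing.

```python
class Bank:
    """Toy DRAM bank: rows of cells plus a single row-buffer (hypothetical model)."""

    def __init__(self, num_rows, row_size):
        self.cells = [[0] * row_size for _ in range(num_rows)]
        self.row_buffer = None  # contents of the currently open row
        self.open_row = None    # index of the currently open row, if any

    def activate(self, row):
        # Sensing moves the row's data into the row-buffer and drains the cells.
        self.row_buffer = list(self.cells[row])
        self.cells[row] = [None] * len(self.row_buffer)  # destructive read
        self.open_row = row

    def precharge(self):
        # Simplifying assumption: the buffer is written back when the row closes.
        if self.open_row is not None:
            self.cells[self.open_row] = list(self.row_buffer)
            self.row_buffer = None
            self.open_row = None

    def read(self, row, col):
        # Accesses to the open row are served by the row-buffer.
        if self.open_row != row:
            self.precharge()
            self.activate(row)
        return self.row_buffer[col]


bank = Bank(num_rows=4, row_size=8)
bank.cells[1][3] = 7
value = bank.read(1, 3)  # opens row 1, then serves the read from the buffer
```

Reading a different row afterwards closes row 1 first, which is the point at which this model restores the drained cells.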

“…Using a cycle-accurate DRAM simulator, we evaluate PARA's performance impact on 29 single-threaded workloads from SPEC CPU2006, TPC, and memory-intensive microbenchmarks. (We assume a reasonable system setup [41] with a 4GHz out-of-order core and dual-channel DDR3-1600.) Due to re-mapping, we conservatively assume that a row can have up to ten different rows as neighbors, not just two.…”

Section: Seventh Solution: PARA (mentioning)

confidence: 99%


Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors

Kim, Daly, Kim, et al. 2014

2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA)

Self Cite

404

1,006


Abstract. Memory isolation is a key property of a reliable and secure computing system: an access to one memory address should not have unintended side effects on data stored in other addresses. However, as DRAM process technology scales down to smaller dimensions, it becomes more difficult to prevent DRAM cells from electrically interacting with each other. In this paper, we expose the vulnerability of commodity DRAM chips to disturbance errors. By reading from the same address in DRAM, we show that it is possible to corrupt data in nearby addresses. More specifically, activating the same row in DRAM corrupts data in nearby rows. We demonstrate this phenomenon on Intel and AMD systems using a malicious program that generates many DRAM accesses. We induce errors in most DRAM modules (110 out of 129) from three major DRAM manufacturers. From this we conclude that many deployed systems are likely to be at risk. We identify the root cause of disturbance errors as the repeated toggling of a DRAM row's wordline, which stresses inter-cell coupling effects that accelerate charge leakage from nearby rows. We provide an extensive characterization study of disturbance errors and their behavior using an FPGA-based testing platform. Among our key findings, we show that (i) it takes as few as 139K accesses to induce an error and (ii) up to one in every 1.7K cells is susceptible to errors. After examining various potential ways of addressing the problem, we propose a low-overhead solution to prevent the errors.


“…
Commodity: DDR3 (2007) [14]; DDR4 (2012) [18]
Low-Power: LPDDR3 (2012) [17]; LPDDR4 (2014) [20]
Graphics: GDDR5 (2009) [15]
Performance: eDRAM [28], [32]; RLDRAM3 (2011) [29]
3D-Stacked: WIO (2011) [16]; WIO2 (2014) [21]; MCDRAM (2015) [13]; HBM (2013) [19]; HMC1.0 (2013) [10]; HMC1.1 (2014) [11]
Academic: SBA/SSA (2010) [38]; Staged Reads (2012) [8]; RAIDR (2012) [27]; SALP (2012) [24]; TL-DRAM (2013) [26]; RowClone (2013) [37]; Half-DRAM (2014) [39]; Row-Buffer Decoupling (2014) [33]; SARP (2014) [6]; AL-DRAM (2015) [25]

At the forefront of such innovations should be DRAM simulators, the software tool with which to evaluate the strengths and weaknesses of each new proposal. However, DRAM simulators have been lagging behind the rapid-fire changes to DRAM.…”

Section: Segment Dram Standards and Architectures (mentioning)

confidence: 99%

“…As listed in Table 1, some were evolutionary upgrades to existing standards (e.g., DDR4, LPDDR4), while some were pioneering implementations of die-stacking (e.g., WIO, HMC, HBM), and still others were academic research projects in experimental stages (e.g., Udipi et al. [38], Kim et al. [24]).…”

Section: Introduction (mentioning)

confidence: 99%

Ramulator: A Fast and Extensible DRAM Simulator

Kim, Yang, Mutlu, 2016

IEEE Comput. Arch. Lett.

Self Cite

457

220


Abstract. Recently, both industry and academia have proposed many different roadmaps for the future of DRAM. Consequently, there is a growing need for an extensible DRAM simulator, which can be easily modified to judge the merits of today's DRAM standards as well as those of tomorrow. In this paper, we present Ramulator, a fast and cycle-accurate DRAM simulator that is built from the ground up for extensibility. Unlike existing simulators, Ramulator is based on a generalized template for modeling a DRAM system, which is only later infused with the specific details of a DRAM standard. Thanks to such a decoupled and modular design, Ramulator is able to provide out-of-the-box support for a wide array of DRAM standards: DDR3/4, LPDDR3/4, GDDR5, WIO1/2, HBM, as well as some academic proposals (SALP, AL-DRAM, TL-DRAM, RowClone, and SARP). Importantly, Ramulator does not sacrifice simulation speed to gain extensibility: according to our evaluations, Ramulator is 2.5× faster than the next fastest simulator. Ramulator is released under the permissive BSD license.


“…DRAM represents each bit of memory using a single transistor and capacitor, organizing these memory cells in two-dimensional arrays (banks) to amortize control overheads. Each bank is sub-divided into 512 × 512 cell subarrays and all data within neighboring subarrays are connected to one or more neighboring data pins for efficiency [13,14,15,16].…”

Section: Dram (mentioning)

confidence: 99%
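
The subdivision of a bank into 512 × 512 cell subarrays, as described in the statement above, can be illustrated with a small address-mapping sketch. The function names and the example bank dimensions (32768 rows × 8192 columns) are assumptions chosen for illustration; they do not come from the cited text, and real devices may lay out subarrays differently.

```python
SUBARRAY_ROWS = 512  # subarray height in cells, from the quoted statement
SUBARRAY_COLS = 512  # subarray width in cells, from the quoted statement


def locate_cell(row, col):
    """Map a bank-level (row, col) to ((subarray_row, subarray_col), (local_row, local_col))."""
    subarray_row, local_row = divmod(row, SUBARRAY_ROWS)
    subarray_col, local_col = divmod(col, SUBARRAY_COLS)
    return (subarray_row, subarray_col), (local_row, local_col)


def subarray_count(bank_rows, bank_cols):
    """Number of subarrays needed to tile a bank, rounding up on each axis."""
    tiles_down = -(-bank_rows // SUBARRAY_ROWS)    # ceiling division
    tiles_across = -(-bank_cols // SUBARRAY_COLS)  # ceiling division
    return tiles_down * tiles_across


# Hypothetical bank geometry, for illustration only.
total = subarray_count(32768, 8192)
subarray, local = locate_cell(1000, 600)
```

Two cells whose rows fall in different subarray rows sit behind different local sense amplifiers, which is the structural property that subarray-level parallelism builds on.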