Abstract-The technique of module-threading uses standard DDR DRAM components to build modified memory modules. These modified modules incorporate one or more additional control signals. The modification permits the module to operate at higher performance levels and at lower power levels than standard modules. The modified modules are also capable of finer-granularity transactions while still operating at full bandwidth.
Index Terms-CMOS memory integrated circuits, Distributed memory systems, Memory management, Memory architecture, MOS memory integrated circuits, MOSFET memory integrated circuits, Shared memory systems.
Cryogenic superconducting digital processors offer the promise of greatly reduced operating power for server-class computing systems. This is due to the exceptionally low energy per operation of Single Flux Quantum (SFQ) circuits built from Josephson junction devices operating at a temperature of 4 K. Unfortunately, no suitable same-temperature memory technology yet exists to complement these SFQ logic technologies. Possible memory technologies are in the early stages of development but will take years to reach the cost per bit and capacity of current semiconductor memory. We discuss the pros and cons of four alternative memory architectures that could be coupled to SFQ-based processors. Our feasibility studies indicate that cold memories built from CMOS DRAM and operating at 77 K can support superconducting processors at low cost per bit, and that they can do so today.
Rapidly evolving workloads and exploding data volumes place great pressure on data-center compute, IO, and memory performance, and especially on memory capacity. Increasing memory capacity requires a commensurate reduction in memory cost per bit. DRAM technology scaling has been steadily delivering affordable capacity increases, but DRAM scaling is rapidly reaching physical limits. Other technologies such as Flash, enhanced Flash, Phase Change Memory, and Spin Torque Transfer Magnetic RAM hold promise for creating high-capacity memories at lower cost per bit. However, these technologies have attributes that require careful management. We propose a hybrid DIMM architecture that uses a hardware-managed DRAM in front of enhanced Flash, which has much lower read latencies than conventional Flash. We explore the design space of such storage-class memory (SCM) devices in the context of different technology parameters, evaluating performance and endurance for data-center workloads. Our hybrid memory architecture is commercially realizable and can use standard DIMM form factors, giving it a low barrier to market entry. We find that for workloads like media streaming, enhanced Flash can be combined with DRAM to enable 88% of the performance of a DRAM-only system of the same capacity at 23% of the cost, even when factoring in replacement costs due to wear-out. The bottom line is that cost per performance is a factor of 3.8 better than DRAM.
1. INTRODUCTION
Data-center servers struggle to keep up with rapidly evolving workloads and exploding data volumes. For many Big Data workloads, DRAM capacity is as important as compute, IO, and
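The factor of 3.8 quoted in the hybrid-DIMM abstract above can be checked directly from the two percentages given there. The following is a minimal sketch of that arithmetic only; the variable names are illustrative assumptions and do not come from the paper, and the calculation assumes "cost per performance" means relative cost divided by relative performance, normalized to a DRAM-only system.

```python
# Figures quoted in the abstract for media-streaming workloads.
# Variable names are hypothetical, chosen for this illustration only.
relative_performance = 0.88  # hybrid performance as a fraction of DRAM-only
relative_cost = 0.23         # hybrid cost as a fraction of DRAM-only,
                             # including replacement cost due to wear-out

# Cost per unit of performance, normalized so that DRAM-only equals 1.0.
hybrid_cost_per_perf = relative_cost / relative_performance

# Improvement over DRAM-only is the reciprocal of the normalized value.
improvement_factor = 1.0 / hybrid_cost_per_perf

print(f"hybrid cost per performance: {hybrid_cost_per_perf:.3f} of DRAM-only")
print(f"improvement factor: {improvement_factor:.1f}x")
```

Under that assumption the result rounds to the 3.8 stated in the abstract.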