Long interconnects are becoming an increasingly important problem from both power and performance perspectives. This motivates designers to adopt on-chip network-based communication infrastructures and three-dimensional (3D) designs where multiple device layers are stacked together. Considering the current trends towards increasing use of chip multiprocessing, it is timely to consider 3D chip multiprocessor design and memory networking issues, especially in the context of data management in large L2 caches. The overall goal of this paper is to study the challenges for L2 design and management in 3D chip multiprocessors. Our first contribution is to propose a router architecture and a topology design that makes use of a network architecture embedded into the L2 cache memory. Our second contribution is to demonstrate, through extensive experiments, that a 3D L2 memory architecture generates much better results than the conventional two-dimensional (2D) designs under different number of layers and vertical (inter-wafer) connections. In particular, our experiments show that a 3D architecture with no dynamic data migration generates better performance than a 2D architecture that employs data migration. This also helps reduce power consumption in L2 due to a reduced number of data movements.
Main memory contains transient information for all resident applications. However, if memory chip contents survives power-off, e.g., via freezing DRAM chips, sensitive data such as passwords and keys can be extracted. Main memory persistence will soon be the norm as recent advancements in MRAM and FeRAM position non-volatile memory technologies for widespread deployment in laptop, desktop, and embedded system main memory. Unfortunately, the same properties that provide energy efficiency, tolerance against power failure, and "instant-on" powerup also subject systems to offline memory scanning. In this paper, we propose a Memory Encryption Control Unit (MECU) that provides memory confidentiality during system suspend and across reboots. The MECU encrypts all memory transfers between the processor-local level 2 cache and main memory to ensure plaintext data is never written to the persistent medium. The MECU design is outlined and performance and security trade-offs considered. We evaluate a MECU-enhanced architecture using the SimpleScalar hardware simulation framework on several hardware benchmarks. This analysis shows the majority of memory accesses are delayed by less than 1 ns, with higher access latencies (caused by resume state reconstruction) subsiding within 0.25 seconds of a system resume. In effect, the MECU provides zero-cost steady state memory confidentiality for non-volatile main memory.
As technology scales, temperature in modern highperformance VLSI circuits has increased dramatically. Temperature affects not only the performance but also the power, reliability, and cost of the system. In this paper, we examine two approaches to reduce the on-chip temperature: one is to reduce the power densities of hot functional units by allocating more die area (growing units), and the other one is to use three different migrating computation (MC) techniques to migrate activities among duplicated functional units. Through the use of tools for architectural power and temperature estimation, the effectiveness of these two methods are evaluated for the Alpha 21264 microprocessor.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.