Caches are known to consume a large part of total microprocessor power. Traditionally, voltage scaling has been used to reduce both dynamic and leakage power in caches. However, aggressive voltage reduction causes process-variation-induced failures in cache SRAM arrays, which compromise cache reliability. In this article, we propose FFT-Cache, a flexible fault-tolerant cache that uses a flexible defect map to configure its architecture, achieving a significant reduction in energy consumption through aggressive voltage scaling while maintaining high reliability. FFT-Cache uses a portion of faulty cache blocks as redundancy (using block-level or line-level replication within or between sets) to tolerate other faulty cache lines and blocks. Our configuration algorithm categorizes the cache lines based on the degree of conflict between their blocks to reduce the granularity of redundancy replacement. FFT-Cache thereby sacrifices a minimal number of cache lines to avoid impacting performance while tolerating the maximum number of defects. Our experimental results on a processor executing SPEC2K benchmarks demonstrate that the operational voltage of both L1 and L2 caches can be reduced down to 375 mV, which achieves up to 80% reduction in dynamic power and up to 48% reduction in leakage power. This comes with only a small performance loss (<5%) and 13% area overhead.
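The notion of "conflict" between the blocks of two cache lines can be illustrated with a small sketch. This is not the FFT-Cache configuration algorithm itself; it assumes, for illustration only, that each line's defects are stored as a bitmask with one bit per block, and that two lines conflict when they are faulty at the same block position. The names `has_conflict` and `pick_redundancy` are hypothetical.

```python
from typing import List, Optional


def has_conflict(faults_a: int, faults_b: int) -> bool:
    """Return True if both lines are faulty at the same block position.

    Assumption: bit i of each mask is 1 when block i of that line is faulty.
    """
    return (faults_a & faults_b) != 0


def pick_redundancy(target: int, candidates: List[int]) -> Optional[int]:
    """Return the index of the first candidate line that does not conflict
    with the target line, or None if every candidate conflicts.

    A conflict-free candidate has a fault-free block at every position
    where the target is faulty, so it could serve as redundancy for it.
    """
    for index, candidate in enumerate(candidates):
        if not has_conflict(target, candidate):
            return index
    return None


if __name__ == "__main__":
    target_line = 0b0011                      # blocks 0 and 1 are faulty
    other_lines = [0b0001, 0b1100, 0b0100]    # hypothetical defect masks
    print(pick_redundancy(target_line, other_lines))
```

In this toy setting the first candidate shares a faulty block position with the target, so the search moves on to the next line whose faults do not overlap.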
With memories continuing to dominate the area, power, cost and performance of a design, there is a critical need to provision reliable, high-performance memory bandwidth for emerging applications. Memories are susceptible to degradation and failures from a wide range of manufacturing, operational and environmental effects, requiring a multi-layer hardware/software approach that can tolerate, adapt and even opportunistically exploit such effects. The overall memory hierarchy is also highly vulnerable to the adverse effects of variability and operational stress. After reviewing the major memory degradation and failure modes, this paper describes the challenges for dependability across the memory hierarchy, and outlines research efforts to achieve multi-layer memory resilience using a hardware/software approach. Two specific exemplars are used to illustrate multi-layer memory resilience: first, we describe static and dynamic policies to achieve energy savings in caches using aggressive voltage scaling combined with disabling faulty blocks; and second, we show how software characteristics can be exposed to the architecture in order to mitigate the aging of large register files in GPGPUs. These approaches can further benefit from semantic retention of application intent to enhance memory dependability across multiple abstraction levels, including applications, compilers, run-time systems, and hardware platforms.
Fault-Tolerant Voltage-Scalable (FTVS) SRAM cache architectures are a promising approach to improve energy efficiency of memories in the presence of nanoscale process variation. Complex FTVS schemes are commonly proposed to achieve very low minimum supply voltages, but these can suffer from high overheads and thus do not always offer the best power/capacity trade-offs. We observe on our 45 nm test chips that the "fault inclusion property" can enable lightweight fault maps that support multiple runtime supply voltages. Based on this observation, we propose a simple and low-overhead FTVS cache architecture for power/capacity scaling. Our mechanism combines multilevel voltage scaling with optional architectural support for power gating of blocks as they become faulty at low voltages. A static (SPCS) policy sets the runtime cache VDD once such that only a few cache blocks may be faulty in order to minimize the impact on performance. We describe a Static Power/Capacity Scaling (SPCS) policy and two alternate Dynamic Power/Capacity Scaling (DPCS) policies that opportunistically reduce the cache voltage even further for more energy savings. This architecture achieves lower static power for all effective cache capacities than a recent more complex FTVS scheme. This is due to significantly lower overheads, despite the inability of our approach to match the min-VDD of the competing work at a fixed target yield. Over a set of SPEC CPU2006 benchmarks on two system configurations, the average total cache (system) energy saved by SPCS is 62% (22%), while the two DPCS policies achieve roughly similar energy reduction, around 79% (26%). On average, the DPCS approaches incur 2.24% performance and 6% area penalties.
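To see why a fault inclusion property could make a fault map lightweight, consider the following minimal sketch. It assumes the property means that a block faulty at some supply voltage is also faulty at every lower voltage; the abstract does not spell this out, so treat it as an assumption. Under that reading, one small integer per block is enough to cover all supported voltage levels. The function names and the three-level example are hypothetical and are not taken from the paper.

```python
from typing import List


def encode_block(faulty_at_level: List[bool]) -> int:
    """Compress a block's per-level fault flags into one small integer.

    faulty_at_level[i] is True when the block is faulty at voltage level i,
    with levels ordered from lowest voltage (index 0) to highest.
    Under the assumed inclusion property the flags must look like
    True, ..., True, False, ..., False, so the count of True values
    fully describes the block.
    """
    code = sum(faulty_at_level)
    expected = [True] * code + [False] * (len(faulty_at_level) - code)
    if faulty_at_level != expected:
        raise ValueError("fault inclusion property violated")
    return code


def is_faulty(code: int, level: int) -> bool:
    """A block is faulty at every level index below its code."""
    return level < code


def usable_blocks(fault_map: List[int], level: int) -> int:
    """Count blocks that remain usable at the given voltage level."""
    return sum(1 for code in fault_map if not is_faulty(code, level))


if __name__ == "__main__":
    # Hypothetical measurements for three blocks at three voltage levels.
    measured = [
        [False, False, False],  # works at every level
        [True, False, False],   # fails only at the lowest voltage
        [True, True, False],    # works only at the highest voltage
    ]
    fault_map = [encode_block(flags) for flags in measured]
    for level in range(3):
        print(level, usable_blocks(fault_map, level))
```

With this encoding the map needs roughly log2(levels + 1) bits per block instead of one bit per block per voltage level, which is one plausible reading of "lightweight" here.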