Leakage power is a major concern in current and future microprocessor designs. In this paper, we explore the potential of architectural techniques to reduce leakage through power-gating of execution units. This paper first develops parameterized analytical equations that estimate the break-even point for application of power-gating techniques. The potential for power-gating execution units is then evaluated, for the range of relevant break-even points determined by the analytical equations, using a state-of-the-art out-of-order superscalar processor model. The power-gating potential of the floating-point and fixed-point units of this processor is then evaluated using three different techniques to detect opportunities for entering sleep mode: ideal, time-based, and branch-misprediction-guided. Our results show that, using the time-based approach, floating-point units can be put to sleep for up to 28% of the execution cycles at a performance loss of 2%. For the fixed-point units, which are more difficult to power-gate, the branch-misprediction-guided technique allows the units to be put to sleep for up to 40% more of the execution cycles than the simpler time-based technique, with similar performance impact. Overall, our experiments demonstrate that architectural techniques can be used effectively in power-gating execution units.
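The time-based technique mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `time_based_gating`, the parameters `idle_threshold` and `wakeup_latency`, and the per-cycle boolean trace are all assumptions made for illustration. The sketch assumes a unit enters sleep after a fixed number of consecutive idle cycles and pays a fixed wake-up penalty when work arrives while it is asleep.

```python
def time_based_gating(busy, idle_threshold, wakeup_latency):
    """Simulate a hypothetical time-based sleep policy on a per-cycle trace.

    busy: sequence of bools, True when the unit has work in that cycle.
    idle_threshold: consecutive idle cycles observed before entering sleep
        (an assumed policy parameter).
    wakeup_latency: stall cycles charged each time work arrives while the
        unit is asleep (an assumed cost model).
    Returns (sleep_cycles, stall_cycles).
    """
    idle_run = 0        # consecutive idle cycles seen while awake
    asleep = False
    sleep_cycles = 0
    stall_cycles = 0
    for has_work in busy:
        if has_work:
            if asleep:
                # Work arrived during sleep: pay the wake-up penalty.
                stall_cycles += wakeup_latency
                asleep = False
            idle_run = 0
        elif asleep:
            sleep_cycles += 1
        else:
            idle_run += 1
            if idle_run >= idle_threshold:
                asleep = True  # sleep starts on the next idle cycle
    return sleep_cycles, stall_cycles


trace = [True, False, False, False, False, False, True, False, True]
print(time_based_gating(trace, idle_threshold=2, wakeup_latency=3))
```

Under this simplified model, a lower `idle_threshold` increases the time spent asleep but also the number of wake-up stalls, which is the trade-off the break-even analysis is meant to capture.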
Victor V. Zyuban
When it comes to performance, modern computer design has become a well-structured art which starts with instruction sets that maximize opportunities for concurrency, follows through with fast organizational techniques such as pipelining and superscalar execution, and ends with clever macro and circuit designs that are based on inherently fast CMOS fabrication technologies. When it comes to low power, however, exactly the opposite is true. Current techniques start with lowering supply voltages and making process changes to minimize capacitance, followed by some relatively simple techniques for minimizing power for particular logic macros, and then utilizing relatively ad hoc techniques, such as 'sleep modes', at higher levels. This work attempts to reverse this by bringing the power issue to the earliest phase of high-performance microprocessor development. We propose a methodology for power optimization of high-performance microprocessors at the microarchitecture level. In particular, our work explores solutions to the problem that do not compromise performance. First, major targets for power reduction are identified within the microarchitecture, where power is heavily consumed or will be heavily consumed in next-generation processors. This involves developing energy models for structures where power grows with increasing issue width, such as the Register File, Issue Window, Memory Disambiguation Unit, etc. Then, a multicluster microar-
Balancing hardware intensity in microprocessor pipelines
The evaluation of architectural tradeoffs is complicated by implications in the circuit domain which are typically not captured in the analysis but substantially affect the results. We propose a metric of hardware intensity, which is useful for evaluating issues that affect both circuits and architecture. Analyzing data for actual designs, we show how to measure the introduced parameters and discuss variations between observed results and common theoretical assumptions. For a power-efficient design, we derive relations for hardware intensity and supply voltage V under progressively more general situations and illustrate the use of these equations in simple examples. Then we establish a relation between the architectural energy-efficiency metric and hardware intensity, and we derive expressions for evaluating the effect of modifications at the microarchitectural level on processor frequency and power, assuming the optimal tuning of the pipeline. These relations will guide the architect to achieve an energy-optimal balance between architectural complexity and hardware intensity.
Microarchitectural redundancy has been proposed as a means of improving chip lifetime reliability. It is typically used in a reactive way, allowing chips to maintain operability in the presence of failures by detecting and isolating, correcting, and/or replacing components on a first-come, first-served basis only after they become faulty. In this paper, we explore an alternative, preferable method of exploiting microarchitectural redundancy to enhance chip lifetime reliability. In our proposed approach, redundancy is used proactively to allow non-faulty microarchitecture components to be temporarily deactivated, on a rotating basis, to suspend and/or recover from certain wearout effects. This approach improves chip lifetime reliability by warding off the onset of wearout failures as opposed to reacting to them after the fact. Applied to on-chip cache SRAM for combating NBTI-induced wearout failure, our proactive wearout recovery approach increases lifetime reliability (measured in mean-time-to-failure) of the cache by about a factor of seven relative to no use of microarchitectural redundancy and a factor of five relative to conventional reactive use of redundancy having similar area overhead.
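The idea of deactivating components "on a rotating basis" can be sketched as a simple round-robin schedule. This is only an illustration under stated assumptions, not the paper's mechanism: the function names `rotation_schedule` and `active_fraction`, the notion of fixed-length epochs, and resting exactly one unit per epoch are all hypothetical choices.

```python
def rotation_schedule(num_units, num_epochs):
    """Return, for each epoch, the index of the unit that is resting.

    Assumes exactly one unit is deactivated per epoch, chosen round-robin
    (a hypothetical policy used only for illustration).
    """
    return [epoch % num_units for epoch in range(num_epochs)]


def active_fraction(num_units, num_epochs):
    """Fraction of epochs each unit spends active under the schedule above."""
    schedule = rotation_schedule(num_units, num_epochs)
    return [1 - schedule.count(unit) / num_epochs for unit in range(num_units)]


print(rotation_schedule(4, 6))
print(active_fraction(4, 8))
```

With four units and one resting per epoch, each unit is active for three quarters of the time in this model; the recovery periods are spread evenly rather than being triggered by a fault.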