Abstract: While Moore's law scaling continues to double transistor density every technology generation, supply voltage reduction has essentially stopped, increasing both power density and total energy consumed in conventional microprocessors. Therefore, future processors will require an architecture that can: a) take advantage of the massive number of transistors that will be available; and b) operate these transistors in the near-threshold supply domain, thereby achieving near-optimal energy per computation by balancing leakage and dynamic energy consumption. Unfortunately, this optimality is typically achieved while running at very low frequencies (i.e., 0.1-10 MHz) and with only one computation executing per cycle, so performance is limited. Further, near-threshold designs suffer from severe process variability that can introduce extremely large delay variations. In this paper, we propose a near-energy-optimal stream processor family that relies on massively parallel, near-threshold VLSI circuits and interconnect, incorporating cooperative circuit/architecture techniques to tolerate the expected large delay variations. Initial estimates from circuit simulations show that it is possible to achieve greater than 1 Giga-Operations per second (1 GOP/s) with less than 1 mW total power consumption, enabling a new class of energy-constrained, high-throughput computing applications.
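The figures quoted in this abstract imply a simple piece of arithmetic, sketched below as a rough illustration only. The function names are hypothetical and not taken from the paper, and the sketch assumes every parallel unit completes exactly one operation per clock cycle. It shows how many operations per cycle are needed to reach 1 GOP/s at the quoted near-threshold clock rates, and what energy per operation a 1 mW budget implies at that throughput.

```python
def required_parallel_ops(target_ops_per_s: float, clock_hz: float) -> float:
    """Operations that must complete each cycle to hit the target throughput.

    Assumption (not from the paper): each parallel unit finishes exactly
    one operation per clock cycle.
    """
    return target_ops_per_s / clock_hz


def energy_per_op_joules(power_w: float, ops_per_s: float) -> float:
    """Average energy per operation implied by a power budget and a throughput."""
    return power_w / ops_per_s


if __name__ == "__main__":
    target = 1e9  # 1 GOP/s, the throughput quoted in the abstract
    # The abstract quotes 0.1-10 MHz as the typical near-threshold range.
    for clock_hz in (1e5, 1e6, 1e7):
        width = required_parallel_ops(target, clock_hz)
        print(f"{clock_hz / 1e6:g} MHz -> {width:g} operations per cycle")
    # 1 mW at 1 GOP/s corresponds to roughly one picojoule per operation.
    print(f"{energy_per_op_joules(1e-3, target):.1e} J per operation")
```

Under these assumptions, even the top of the quoted frequency range needs on the order of a hundred operations in flight per cycle, which is consistent with the abstract's emphasis on massive parallelism.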
A significant portion of the energy dissipated in modern integrated circuits is consumed by the overhead associated with timing guardbands that ensure reliable execution. Timing speculation, where the pipeline operates at an unsafe voltage with any rare errors detected and resolved by the architecture, has been demonstrated to significantly improve the energy efficiency of scalar processor designs. Unfortunately, applying the same timing-speculative approach to wide-SIMD architectures, such as those used in highly efficient GPUs, may not provide similar gains. In this work, we make two important contributions. The first is a set of models describing a parametrized general error probability function, based on measurements of a fabricated chip, and the expected efficiency benefits of timing speculation in a SIMD context. The second contribution is a decoupled SIMD pipeline that utilizes timing speculation and recovery more effectively than a standard SIMD design that uses only conventional timing speculation. The proposed lane decoupling enables each SIMD lane to tolerate timing errors independently of adjacent lanes, resulting in higher throughput and improved scalability. We validate our models and evaluate our design using a cycle-based GPU simulator, describe the conditions where efficiency improvements can be obtained, and explore the benefits of decoupling across a wide range of parameters. Our results show that timing speculation can achieve up to 10.3% improvement in efficiency.
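A toy model can illustrate why timing speculation may pay off less in a lock-step wide-SIMD pipeline than in a scalar one, and why decoupling lanes could help. This is an illustrative sketch only, not the parametrized error model or the simulator from the paper. It assumes that timing errors are independent across lanes with the same per-cycle probability `p_lane`, that every error costs a fixed `penalty` in recovery cycles, and that decoupled lanes never have to wait for one another (an idealization). All names are hypothetical.

```python
def p_any_lane_error(p_lane: float, width: int) -> float:
    """Probability that at least one of `width` independent lanes errs in a cycle."""
    return 1.0 - (1.0 - p_lane) ** width


def coupled_throughput(p_lane: float, width: int, penalty: float) -> float:
    """Relative throughput when one lane's error stalls every lane (lock-step SIMD).

    Expected cycles per instruction = 1 + penalty * P(any lane errs).
    """
    return 1.0 / (1.0 + penalty * p_any_lane_error(p_lane, width))


def decoupled_throughput(p_lane: float, penalty: float) -> float:
    """Relative throughput when each lane pays only for its own errors.

    Idealized assumption: lanes never wait on each other to resynchronize.
    """
    return 1.0 / (1.0 + penalty * p_lane)


if __name__ == "__main__":
    p_lane, penalty = 0.001, 10.0  # arbitrary illustrative values
    for width in (1, 8, 32, 128):
        c = coupled_throughput(p_lane, width, penalty)
        d = decoupled_throughput(p_lane, penalty)
        print(f"width={width:4d}  coupled={c:.4f}  decoupled={d:.4f}")
```

In this simplified model the lock-step penalty grows with SIMD width because the chance that some lane errs approaches one, while the idealized decoupled pipeline stays at the single-lane rate. A real design would fall somewhere in between, since decoupled lanes must eventually resynchronize.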
While Moore's law scaling continues to double transistor density every technology generation, new design challenges are introduced. One of these challenges is variation, resulting in deviations in the behavior of transistors, most importantly in switching delays. These exaggerated delays widen the gap between the average and the worst-case behavior of a circuit. Conventionally, circuits are designed to accommodate the worst-case delay and are therefore becoming very limited in their performance advantages. Thus, allowing for an average-case-oriented design is a promising solution, maintaining the pace of performance improvement over future generations. However, to maintain correctness, such an approach will require on-the-fly mechanisms to prevent, detect, and resolve violations. This paper explores such mechanisms, allowing the improvement of circuit performance under intensifying variations. We present speculative error detection techniques along with recovery mechanisms. We continue by discussing their ability to operate under extreme variations, including sub-threshold operation. While the main focus of this survey is on circuit approaches, for completeness we discuss higher-level architectural and algorithmic techniques as well. (J. Low Power Electron. Appl. 2011, 1 335)