Evgeni Krimer scite author profile

Abstract-While Moore's law scaling continues to double transistor density every technology generation, supply voltage reduction has essentially stopped, increasing both power density and total energy consumed in conventional microprocessors. Therefore, future processors will require an architecture that can: a) take advantage of the massive amount of transistors that will be available; and b) operate these transistors in the near-threshold supply domain, thereby achieving near optimal energy/computation by balancing the leakage and dynamic energy consumption. Unfortunately, this optimality is typically achieved while running at very low frequencies (i.e. 0.1 − 10MHz ) and with only one computation executing per cycle, such that performance is limited. Further, near-threshold designs suffer from severe process variability that can introduce extremely large delay variations. In this paper, we propose a near energy-optimal, stream processor family that relies on massively parallel, near-threshold VLSI circuits and interconnect, incorporating cooperative circuit/architecture techniques to tolerate the expected large delay variations. Initial estimations from circuit simulations show that it is possible to achieve greater than 1 Giga-Operations per second (1GOP/s) with less than 1mW total power consumption, enabling a new class of energy-constrained, high-throughput computing applications.

show abstract

Lane decoupling for improving the timing-error resiliency of wide-SIMD architectures

Krimer

Chiang

Erez

2012

SIGARCH Comput. Archit. News

View full text Add to dashboard Cite

A significant portion of the energy dissipated in modern integrated circuits is consumed by the overhead associated with timing guardbands that ensure reliable execution. Timing speculation, where the pipeline operates at an unsafe voltage with any rare errors detected and resolved by the architecture, has been demonstrated to significantly improve the energy-efficiency of scalar processor designs. Unfortunately, applying the same timing-speculative approach to wide-SIMD architectures, such as those used in highlyefficient GPUs, may not provide similar gains.In this work, we make two important contributions. The first is a set of models describing a parametrized general error probability function that is based on measurements of a fabricated chip and the expected efficiency benefits of timing speculation in a SIMD context. The second contribution is a decoupled SIMD pipeline that more effectively utilizes timing speculation and recovery, when compared with a standard SIMD design that uses only conventional timing speculation. The proposed lane decoupling enables each SIMD lane to tolerate timing errors independent of other adjacent lanes, resulting in higher throughput and improved scalability. We validate our models and evaluate our design using a cycle-based GPU simulator, describe the conditions where efficiency improvements can be obtained, and explore the benefits of decoupling across a wide range of parameters. Our results show that timing speculation can achieve up to 10.3% improvement in efficiency.

show abstract

Error Detection and Recovery Techniques for Variation-Aware CMOS Computing: A Comprehensive Review

Crop

Krimer

Moezzi-Madani

et al. 2011

JLPEA

View full text Add to dashboard Cite

While Moore's law scaling continues to double transistor density every technology generation, new design challenges are introduced. One of these challenges is variation, resulting in deviations in the behavior of transistors, most importantly in switching delays. These exaggerated delays widen the gap between the average and the worst case behavior of a circuit. Conventionally, circuits are designed to accommodate the worst case delay and are therefore becoming very limited in their performance advantages. Thus, allowing for an average case oriented design is a promising solution, maintaining the pace of performance improvement over future generations. However, to maintain correctness, such an approach will require on the fly mechanisms to prevent, detect, and resolve violations. This paper explores such mechanisms, allowing the improvement of circuit performance under intensifying variations. We present speculative error detection techniques along with recovery mechanisms. We continue by discussing their ability to operate under extreme variations including sub-threshold operation. While the main focus of this survey is on circuit J. Low Power Electron. Appl. 2011, 1 335 approaches, for its completeness, we discuss higher-level, architectural and algorithmic techniques as well.

show abstract

Packet-level static timing analysis for NoCs

Krimer

Erez

Keslassy

et al. 2009

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Evgeni Krimer

A 530mV 10-lane SIMD processor with variation resiliency in 45nm SOI

Synctium: a Near-Threshold Stream Processor for Energy-Constrained Parallel Applications

Lane decoupling for improving the timing-error resiliency of wide-SIMD architectures

Error Detection and Recovery Techniques for Variation-Aware CMOS Computing: A Comprehensive Review

Packet-level static timing analysis for NoCs

Contact Info

Product

Resources

About