The impact of NBTI on the performance of combinational and sequential circuits

Wang, Wenping; Yang, Shengqi; Bhardwaj, Sarvesh; Vattikonda, Rakesh; Vrudhula, Sarma; Liu, Frank; Cao, Yu

doi:10.1145/1278480.1278573

Cited by 115 publications

(13 citation statements)

References 14 publications

“…Both of them can gradually degrade performance over time. The researchers have evidence that circuit path delay can increase by 10% during the five-year lifetime [10]. Even worse, with technology scaling to the nano-scale, the transistors tend to become more vulnerable and more prone to aging impacts [8].…”

Section: "Sick Silicon" Problem (mentioning)

confidence: 99%

Fault tolerance on-chip: a reliable computing paradigm using self-test, self-diagnosis, and self-repair (3S) approach

Yan

et al. 2018

Sci. China Inf. Sci.

If your computer crashes, you can revive it by a reboot, an empirical solution that usually turns out to be effective. The rationale behind this solution is that transient faults, either in hardware or software, can be fixed by refreshing the machine state. Such a "silver bullet", however, could be futile in the future because the faults, especially those existing in the hardware such as Integrated Circuit (IC) chips, cannot be eliminated by refreshing. What we need is a more sophisticated mechanism to steer the system back to the right track. The "magic cure" is the Fault Tolerance On-Chip (FTOC) mechanism, which relies on a suite of built-in design-for-reliability logic, including fault detection, fault diagnosis, and error recovery, working in a self-supportive manner. We have exploited the FTOC to build a holistic solution ranging from on-chip fault detection to error recovery mechanisms to address faults caused by chips progressively aging. Besides fault detection, the FTOC paradigm provides attractive benefits, such as facilitating graceful performance degradation, mitigating the impact of verification blind spots, and improving the chip yield.

“…Among various wearout phenomena, BTI is one of the most dominant mechanisms [1], [18], which increases the absolute value of threshold voltage (Vth) of transistors over time under stress (such as voltage stress), thus increasing the circuit delay and shortening circuit lifetime [5], [18] (PBTI) affects NMOS transistors that are under positive stress voltage. Although the effect of PBTI has been negligible in previous technologies, it is rapidly becoming an important reliability issue with the introduction of high-k and metal gates [19].…”

Section: A. BTI Wearout and Recovery Basics (mentioning)

confidence: 99%

“…The never-ending demand for higher performance and lower power consumption pushes the aggressive technology scaling and the appearance of emerging devices, while further downscaling leads to major challenges, among which wearout (or aging) has become a huge reliability threat. Bias Temperature Instability (BTI) has been accepted as one of the most dominant wearout factors causing lifetime reliability problems in the front-end of line (FEOL) by worsening metrics across the digital system hierarchy [1]-[4], with performance degradation or intrinsic faults at the circuit level [5], errors at the architecture level [3] and failures at the system level [6]. Thus, dealing with wearout issues (such as BTI) needs to cross layers, where various techniques are necessary to be implemented - from device level up to the application level - to work together to achieve the optimal lifetime and acceptable wearout levels with a low cost [2], [4], [7], [8].…”

Section: Introduction (mentioning)

confidence: 99%

Implications of accelerated self-healing as a key design knob for cross-layer resilience

Guo

Stan

2017

Integration

In this paper we propose a cross-layer accelerated self-healing (CLASH) system which "repairs" its wearout issues in a physical sense through accelerated and active recovery, by which wearout can be reversed while actively applying several accelerated self-healing techniques, such as high temperature and negative voltages. Different from previous solutions of coping with wearout issues (e.g. BTI) by "tolerating", "slowing down" or "compensating", which still leave the irreversible (permanent) wearout component unchecked, the proposed solution is able to fully avoid the irreversible wearout through periodic rejuvenation, and this is inspired by the explored frequency dependent behaviors of wearout and (accelerated and active) recovery based on measurements on FPGAs. We demonstrate a case where the chip can always be brought back to the fresh status by employing a pattern of 31-hour regular operation (under room temperature and nominal voltage) followed by a 1-hour accelerated self-healing (under high temperature and negative voltage). The proposed system integrates the notions of accelerated self-healing across multiple layers of the system stack. At the circuit level, a negative voltage generator and heating elements are designed and implemented; at the architecture level, the core can be allocated in a way such that the dark silicon or redundant resources can be healed by active elements; at the system level, the right balance of stress and accelerated/active recovery can be employed by the system scheduler to fully mitigate the wearout; various wearout sensors act as the media between different layers. Overall, these techniques work together to guarantee that the whole system performs for more of the time at higher levels of performance and power efficiency by fully taking advantage of the extra opportunities enabled by the accelerated self-healing.

“…Bias temperature instability (BTI), hot-carrier injection and gate-oxide wearout are the primary aging mechanisms for CMOS devices. [2][3][4] The negative bias temperature instability (NBTI) for pMOS devices are one of the most prominent and persistent threats for future technologies. NBTI will cause an increase in the threshold voltage (Vth) of the pMOS devices when negative voltage is applied at the gate (logic "0").…”

Section: Introduction (mentioning)

confidence: 99%

Low Power Aging-Aware On-Chip Memory Structure Design by Duty Cycle Balancing

Wang

Jin

Zheng

et al. 2016

J CIRCUIT SYST COMP

The degradation of CMOS devices over the lifetime can cause severe threat to the system performance and reliability at deep submicron semiconductor technologies. The negative bias temperature instability (NBTI) is among the most important sources of the aging mechanisms. Applying the traditional guardbanding technique to address the decreased speed of devices is too costly. On-chip memory structures, such as register files and on-chip caches, suffer a very high NBTI stress. In this paper, we propose the aging-aware design to combat the NBTI-induced aging in integer register files, data caches and instruction caches in high-performance microprocessors. The proposed aging-aware design can mitigate the negative aging effects by balancing the duty cycle ratio of the internal bits in on-chip memory structures. Besides the aging problem, the power consumption is also one of the most prominent issues in microprocessor design. Therefore, we further propose to apply the low power schemes to different memory structures under aging-aware design. The proposed low power aging-aware design can also achieve a significant power reduction, which will further reduce the temperature and NBTI degradation of the on-chip memory structures. Our experimental results show that our aging-aware design can effectively reduce the NBTI stress with 30.8%, 64.5% and 72.0% power saving for the integer register file, data cache and instruction cache, respectively.
