Device scaling has enabled a continuous increase in the performance of integrated circuits. However, severe reliability and yield concerns are arising in the nanotechnology era. Traditionally, most causes and countermeasures were considered purely manufacturing issues, but lately we have seen a shift towards operational reliability issues. Yet apart from intense research on soft errors and system-level approaches, very little effort has been put into low-level design solutions that enhance lifetime reliability. We therefore demonstrate that redundant transistor insertion significantly improves system reliability with regard to Time-Dependent Dielectric Breakdown (TDDB). Furthermore, we introduce an algorithm that identifies the transistors most vulnerable to TDDB. Redundant transistors (called shadow transistors) are then inserted at the identified instances. We also argue for using high-threshold-voltage devices for the redundant transistors. Finally, we present results for a set of benchmark circuits that show the combined approach to be successful. On average, the enhanced designs were 41.8 % more reliable with respect to TDDB than the initial designs, at the price of moderately increased power consumption and delay.
Progressive technology scaling raises the need for efficient VLSI design methods that address the increasing vulnerability to permanent physical defects while also considering the power efficiency of the resulting circuit implementations. Triple Modular Redundancy (TMR) is a common method to counter reliability problems, but it has the drawback of increased area and power consumption. This work introduces a Low Power Redundant (LPR) design solution that targets the power penalty of TMR implementations by means of enhanced and new functional runtime capabilities for error detection and operation control. By exploiting the inherent modularity and parallelism of TMR, the LPR solution applies additional control logic to switch dynamically between compare phases (to indicate faults and their locations) and parallel operation (at reduced operating frequency). This allows power-optimized circuit operation with full support for the treatment of permanent faults. Simulation results on different ALU implementations show a reduction in power consumption of up to 60 % compared to conventional TMR. Furthermore, different strategies for switching between operation modes are introduced that enable power-efficient system operation in the presence of permanent physical defects. Significant reliability improvements are also achieved due to the adaptive use of the redundant modules.
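The two operation modes described in this abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' LPR control logic: the function names (`majority_vote`, `locate_fault`, `compare_phase`, `parallel_phase`) and the representation of each redundant module as a Python callable are hypothetical and chosen purely for illustration. In the compare phase, all three modules process the same operand so that a bitwise 2-of-3 vote can mask a fault and point to the disagreeing module; in the parallel phase, the modules believed healthy each process a different operand.

```python
from typing import Callable, List, Optional, Tuple

Module = Callable[[int], int]  # hypothetical stand-in for one redundant ALU copy


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three module outputs."""
    return (a & b) | (a & c) | (b & c)


def locate_fault(outputs: List[int]) -> Optional[int]:
    """Index of the single module that disagrees with the vote, else None."""
    voted = majority_vote(*outputs)
    bad = [i for i, out in enumerate(outputs) if out != voted]
    return bad[0] if len(bad) == 1 else None


def compare_phase(modules: List[Module], operand: int) -> Tuple[int, Optional[int]]:
    """Run all three modules on the same operand; return (voted result, faulty index)."""
    outputs = [m(operand) for m in modules]
    return majority_vote(*outputs), locate_fault(outputs)


def parallel_phase(modules: List[Module], healthy: List[int],
                   operands: List[int]) -> List[int]:
    """Each healthy module processes a different operand (no voting)."""
    return [modules[i](x) for i, x in zip(healthy, operands)]


# Example: module 2 has a bit permanently flipped (assumed fault model).
modules = [lambda x: x + 1, lambda x: x + 1, lambda x: (x + 1) ^ 0x4]
result, faulty = compare_phase(modules, 3)
healthy = [i for i in range(3) if i != faulty]
parallel_results = parallel_phase(modules, healthy, [10, 20])
```

The sketch models behavior only; it says nothing about the frequency scaling or the power savings reported above, which depend on the hardware implementation.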