“…This is achieved by using low-Vt devices in the critical sparse-tree and high-Vt devices on the noncritical sum-generator paths. The 10× differential in leakage currents between high-Vt and low-Vt devices [4] results in an overall 56% reduction in leakage energy consumption without impacting performance (Table I). Note that high-Vt allocation was performed on an initial all-low-Vt design without transistor resizing.…”
Section: A. Dual-Vt Design (mentioning)
confidence: 99%
“…Thus, the output of the compressor array is a pair of 32-bit numbers that represent the carry-save format of the effective address. This phase of address computation includes the latch data-to-delay, shifter delay, 3:2 compressor delay and setup time at the adder core inputs and takes 98 ps in a 1.2-V 130-nm technology [4]. During this phase, the adder core will be in the precharge state.…”
Abstract: This paper describes a 32-bit address generation unit designed for 4-GHz operation in 1.2-V 130-nm technology. The AGU utilizes a 152-ps sparse-tree adder core to achieve 20% delay reduction, 80% lower interconnect complexity, and a low (1%) active energy leakage component. The dual-Vt semidynamic implementation of the adder core provides the performance of a dynamic CMOS design with an average energy profile similar to static CMOS, enabling 71% savings in average energy with a good sub-130-nm scaling trend. Index Terms: Address generation unit (AGU), high-performance adders, semidynamic design, sparse-tree adder.
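The carry-save format mentioned in the excerpt above can be illustrated with a small behavioral sketch. It is not the circuit from the paper: it only models what a row of 3:2 compressors (bitwise full adders) computes, reducing three 32-bit operands to a sum word and a carry word that the adder core then adds. The function names and the choice of three generic operands are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit datapath


def compress_3_2(a: int, b: int, c: int) -> tuple[int, int]:
    """Reduce three 32-bit operands to a carry-save pair (sum word, carry word).

    Each bit position acts as an independent full adder, so no carry
    propagates across the word in this step.
    """
    sum_word = (a ^ b ^ c) & MASK32
    # The majority function gives the carry-out of each bit; it has the
    # weight of the next bit position, hence the left shift by one.
    carry_word = (((a & b) | (a & c) | (b & c)) << 1) & MASK32
    return sum_word, carry_word


def resolve(sum_word: int, carry_word: int) -> int:
    """Stand-in for the adder core: one carry-propagate add, modulo 2**32."""
    return (sum_word + carry_word) & MASK32


# Hypothetical operands, chosen only to exercise the sketch.
s, c = compress_3_2(0x1000, 0x20, 0x4)
address = resolve(s, c)
```

The only carry-propagating addition is the final one, which is why the compressor stage can be short compared with the adder core.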
“…other known techniques to reduce leakage during standby mode but in this paper we focus on runtime leakage reduction which is a more difficult and pressing problem. Currently dual-Vth is the only practical approach to achieving substantial runtime leakage reduction [9].…”
Power consumption, particularly runtime leakage, in long on-chip buses has grown to an unacceptable portion of the total power budget due to heavy buffer insertion to combat RC delays. In this paper, we propose a new bus encoding algorithm and circuit scheme for on-chip buses that eliminates capacitive crosstalk while simultaneously reducing total power. We introduce a new buffer design approach with selective use of high threshold voltage transistors and couple this buffer design with a novel bus encoding scheme. The proposed encoding scheme significantly reduces total power by 26% and runtime leakage power by 42% while also eliminating capacitive crosstalk. In addition, the proposed encoding is specifically optimized to reduce the complexity of the encoding logic, allowing for a significant reduction in overhead which has not been considered in previous bus encoding work.
“…All simulations are performed at a temperature of 100 °C. A typical global metal layer for a 0.13 µm technology node [9] is used for routing the bus, with a minimum pitch of 1.2 µm (Fig. 9).…”
We propose various low-latency spatial encoder circuits based on bus-invert coding for reducing peak energy and current in on-chip buses with minimum penalty on total latency. The encoders are implemented in dual-rail domino logic with interfaces for static inputs and static buses. A spatial and temporally encoded dynamic bus technique is also proposed for higher performance targets. Comparisons to standard on-chip buses of various lengths with optimal repeater configurations at the 130nm node show the energy-delay and peak current-delay design space in which the different encoder circuits are beneficial. A 9mm spatially encoded static bus exhibits peak energy gains beyond that achievable through repeater optimization for a single cycle operation at 1GHz, with delay and energy overhead of the encoding included. For throughput constrained buses, the spatially encoded static bus can provide up to 31% reduction in peak energy, while the spatially and temporally encoded dynamic bus yields peak current reductions of more than 50% for all bus lengths.
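For readers unfamiliar with bus-invert coding, on which the encoders in the abstract above are based, the following is a minimal behavioral sketch of the classic scheme. It does not model the paper's spatial or temporal encoder circuits. The function names and the 8-bit default width are assumptions, and the toggle of the extra invert line itself is not counted.

```python
def bus_invert_encode(word: int, prev_bus: int, width: int = 8) -> tuple[int, int]:
    """Return (bus value, invert flag) for the next transfer.

    If driving `word` directly would toggle more than half of the bus
    lines relative to the previous bus state, drive its complement
    instead and raise the invert flag.
    """
    mask = (1 << width) - 1
    toggles = bin((word ^ prev_bus) & mask).count("1")
    if toggles > width // 2:
        return (~word) & mask, 1
    return word & mask, 0


def bus_invert_decode(bus: int, invert: int, width: int = 8) -> int:
    """Recover the original word at the receiving end."""
    mask = (1 << width) - 1
    return (~bus) & mask if invert else bus & mask


# Worst case without coding: every data line would toggle.
bus, inv = bus_invert_encode(0xFF, prev_bus=0x00)
decoded = bus_invert_decode(bus, inv)
```

With this rule, at most half of the data lines switch on any transfer, which is the property that bounds peak switching energy and current.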