This paper presents a novel variable-latency multiplier architecture, suitable for implementation either as a self-timed multiplier core or as a fully synchronous multi-cycle multiplier core. The architecture combines a 2nd-order Booth algorithm with a split carry-save array pipelined organization, incorporating multiple-row skipping and a completion-predicting carry-select final adder. The paper reports the architecture and logic design, the CMOS circuit design, and a performance evaluation. In a 0.35 µm CMOS process, the expected sustainable cycle time for a 32-bit synchronous implementation is 2.25 ns. Instruction-level simulations estimate 54% single-cycle and 46% two-cycle operations in SPEC95 execution. In the same CMOS process, the 32-bit asynchronous implementation is expected to reach an average throughput of 1.76 ns and a latency of 3.48 ns in SPEC95 execution.

I. INTRODUCTION

Fast integer multipliers are a key topic in the VLSI design of high-speed microprocessors. Recent results have shown that, through careful full-custom CMOS design, a 54x54-bit multiplication in less than 3 ns is possible [21]. However, with commonly available CMOS processes, microarchitectures with 2 ns cycle times are commercially available [28]. As a result, owing to the registers' setup and hold times, even a fast 32-bit multiplication may not fit in a single cycle, and pipelined multi-cycle multipliers are a common design choice to avoid the whole microarchitecture being limited by a relatively slow multiplier. Data dependencies always limit the throughput of pipelined arithmetic units [22], because of the idle cycles between consecutive dependent operations. To overcome this, synchronous variable-latency pipelined addition units have recently been proposed in industrial DSP design [30]. A variable-latency unit operates as a normal pipelined unit, but for most operands it can complete its operation in a single cycle, thus avoiding the insertion of idle cycles and improving the average throughput.
A synchronous signal flags the cycle in which the operation completes. A more aggressive implementation of this idea is inherent in asynchronous design, with self-timed units capable of an average response faster than the worst case [6][9][14][25][29][39][52].
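To fix ideas, the 2nd-order (radix-4) Booth recoding named above can be illustrated behaviorally as follows. This is a software sketch under our own naming conventions, not the paper's circuit-level design: the multiplier is recoded into digits in {-2, -1, 0, 1, 2} from overlapping 3-bit groups, halving the number of partial products relative to a simple shift-and-add scheme.

```python
# Behavioral sketch of radix-4 (2nd-order) Booth recoding.
# Function names and bit widths are illustrative, not from the paper.

def booth_radix4_digits(y, bits=32):
    """Return the Booth digits (least significant first) of a
    two's-complement multiplier y represented in `bits` bits."""
    y &= (1 << bits) - 1              # two's-complement encoding
    digits = []
    prev = 0                          # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        # group = bits y[i+1], y[i], y[i-1]
        group = (((y >> i) & 0b11) << 1) | prev
        # digit value = -2*y[i+1] + y[i] + y[i-1]
        digit = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
                 0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}[group]
        digits.append(digit)
        prev = (y >> (i + 1)) & 1     # carry y[i+1] into the next group
    return digits

def booth_multiply(x, y, bits=32):
    """Multiply by summing the shifted partial products d_i * x * 4**i."""
    return sum(d * x * (4 ** i)
               for i, d in enumerate(booth_radix4_digits(y, bits)))
```

In the actual array, each digit selects 0, ±x, or ±2x as a partial product; runs of zero digits are what the multiple-row-skipping organization exploits to shorten the average latency.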