Dana S. Henry scite author profile

Our program benchmarks and simulations of novel circuits indicate that large-window processors are feasible. Using our redesigned superscalar components, a large-window processor implemented in today's technology can achieve an increase of 10-60% (geometric mean of 31%) in program speed compared to today's processors. The processor operates at clock speeds comparable to today's processors, but achieves significantly higher ILP.

To measure the impact of a large window on clock speed, we design and simulate new implementations of the logic components that most limit the critical path of our large-window processor: the schedule logic and the wake-up logic. We use log-depth cyclic segmented prefix (CSP) circuits to reimplement these components. Our layouts and simulations of critical paths through these circuits indicate that our large-window processor could be clocked at frequencies exceeding 500MHz in today's technology. Our commit logic and rename logic can also run at these speeds.

To measure the impact of a large window on ILP, we compare two microarchitectures. The first has a 128-instruction window, an 8-wide fetch unit, and 20-wide issue (four each of integer, branch, multiply, float, and memory units). The second has a 32-instruction window and a 4-wide fetch unit, and is comparable to today's processors. For each, we simulate different window reuse and bypass policies. Our simulations show that the large-window processor achieves significantly higher IPC. This performance increase comes despite the fact that the large-window processor uses a wrap-around window while the small-window processor uses a compressing window, which effectively increases the small-window processor's number of outstanding instructions. Furthermore, the large-window processor sometimes pays an extra clock cycle for bypassing.
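As an illustration of the log-depth cyclic segmented prefix (CSP) idea named in the abstract above, here is a minimal software sketch. It is not the authors' circuit: the function name `cyclic_segmented_prefix`, its parameters, and the doubling-step formulation are assumptions made for this example. It only models the data flow: values sit on a ring, a segment bit marks where accumulation restarts, and each position receives the combination of all values from the nearest preceding segment start (wrapping around the ring) up to itself, in about log2(n) combining rounds.

```python
import operator
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def cyclic_segmented_prefix(
    values: Sequence[T],
    segment_starts: Sequence[bool],
    op: Callable[[T, T], T] = operator.add,
) -> List[T]:
    """Inclusive segmented prefix over a ring (hypothetical helper).

    Position i receives op applied, in ring order, to every value from the
    nearest segment start at or before i (wrapping around) up to i.
    """
    n = len(values)
    if n != len(segment_starts):
        raise ValueError("values and segment_starts must have equal length")
    if not any(segment_starts):
        # Without a segment start the ring has no beginning, so the
        # cyclic prefix is not well defined in this sketch.
        raise ValueError("at least one segment start is required")

    # Each element carries (segment bit, accumulated value).
    state: List[Tuple[bool, T]] = list(zip(segment_starts, values))

    def combine(left: Tuple[bool, T], right: Tuple[bool, T]) -> Tuple[bool, T]:
        # Associative segmented operator: a segment start on the right
        # discards everything accumulated on the left.
        left_seg, left_val = left
        right_seg, right_val = right
        return (left_seg or right_seg,
                right_val if right_seg else op(left_val, right_val))

    # Doubling rounds: after the round with distance d, each position covers
    # the 2*d ring positions ending at itself, so ceil(log2(n)) rounds suffice.
    d = 1
    while d < n:
        state = [combine(state[(i - d) % n], state[i]) for i in range(n)]
        d *= 2

    return [val for _, val in state]


if __name__ == "__main__":
    # Ring of four values; the segment starts at index 2 and wraps around.
    print(cyclic_segmented_prefix([1, 2, 3, 4], [False, False, True, False]))
    # → [8, 10, 3, 7]
```

Each round in the loop corresponds loosely to one level of combining logic, which is why the depth grows logarithmically with the ring size. A hardware implementation would typically use a tree of such combine cells rather than n cells per level; that choice is left out of this sketch.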

A tightly-coupled processor-network interface

Henry

Joerg

1992

SIGPLAN Not.

The Ultrascalar processor-an asymptotically scalable superscalar microarchitecture

Henry

Kuszmaul

Viswanath

1999

The poor scalability of existing superscalar processors has been of great concern to the computer engineering community. In particular, the critical-path lengths of many components in existing implementations grow as $\Theta(n^2)$, where $n$ is the fetch width, the issue width, or the window size. This paper presents a novel implementation, called the Ultrascalar processor, that dramatically reduces the asymptotic critical-path length of a superscalar processor. The processor is implemented by a large collection of ALUs with controllers (together called execution stations) connected together by a network of parallel-prefix tree circuits. A fat-tree network connects an interleaved cache to the execution stations. These networks provide the full functionality of superscalar processors including renaming, out-of-order execution, and speculative execution. The Ultrascalar's critical-path length due to gate delays is $\tau_{\text{gates}} = \Theta(\log n)$. The wire delays and chip size depend on the provided memory bandwidth and the layout. If the provided memory bandwidth is $M(n)$ memory operations per clock cycle then, using an H-tree VLSI layout, the critical-path length due to wire delay (speed-of-light delay) is

$$
\tau_{\text{wires}} =
\begin{cases}
\Theta(n^{1/2}) & \text{if } M(n) \text{ is } O(n^{1/2-\varepsilon}) \text{ for } \varepsilon > 0 \text{ [optimal]}, \\
\Theta(n^{1/2} \log n) & \text{if } M(n) \text{ is } \Theta(n^{1/2}) \text{ [near optimal]}, \\
\Theta(M(n)) & \text{if } M(n) \text{ is } \Omega(n^{1/2+\varepsilon}) \text{ for } \varepsilon > 0 \text{ [optimal]}
\end{cases}
$$

(with $M$ suitably constrained). The area is the square of the wire delay.

Predicting conditional branches with fusion-based hybrid predictors

Circuits for wide-window superscalar processors
