2002
DOI: 10.1145/545214.545223

A large, fast instruction window for tolerating cache misses

Abstract: Instruction window size is an important design parameter for many modern processors. Large instruction windows offer the potential advantage of exposing large amounts of instruction level parallelism. Unfortunately, naively scaling conventional window designs can significantly degrade clock cycle time, undermining the benefits of increased parallelism. This paper presents a new instruction window design targeted at achieving the latency tolerance of large windows with the clock cycle time of small windows. T…

Citations: cited by 109 publications (104 citation statements)
References: 40 publications
“…However, several studies that do not need a multithreaded environment have been carried out more recently [20]-[24]. These studies commonly pre-execute instructions when the instruction window resources run out.…”
Section: Studies Considering Power on MLP Exploitation (mentioning)
confidence: 99%
“…Lebeck et al. proposed a scheme that efficiently uses the IQ by moving a load causing an LLC miss and the instructions that depend directly or indirectly on it to a special buffer, called the WIB (waiting instruction buffer) [26]. MLP can be exploited in a small IQ.…”
Section: Exploiting MLP (mentioning)
confidence: 99%
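
Illustrative note: the citation statement above describes parking a missing load and its direct or indirect dependents in the WIB so that the IQ stays small. The following is a minimal Python sketch of that partitioning idea only. It is not the paper's hardware design; the `Instr` record, the `llc_miss` flag, the register names, and the `split_iq_and_wib` helper are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record used only for this sketch."""
    name: str
    dest: Optional[str]            # destination register, if any
    srcs: Tuple[str, ...] = ()     # source registers
    llc_miss: bool = False         # assumption: True for a load that misses in the LLC


def split_iq_and_wib(program: List[Instr]) -> Tuple[List[Instr], List[Instr]]:
    """Walk the program in order and separate instructions that can stay in
    the IQ from those that would be parked in the WIB."""
    waiting_regs = set()  # registers whose values depend on an outstanding miss
    iq, wib = [], []
    for ins in program:
        depends_on_miss = any(src in waiting_regs for src in ins.srcs)
        if ins.llc_miss or depends_on_miss:
            # The missing load and its (transitive) dependents go to the WIB.
            wib.append(ins)
            if ins.dest is not None:
                waiting_regs.add(ins.dest)
        else:
            iq.append(ins)
            # An independent instruction redefines its destination register,
            # so later readers of it no longer wait on the miss.
            if ins.dest is not None:
                waiting_regs.discard(ins.dest)
    return iq, wib


program = [
    Instr("ld_miss", "r1", ("r0",), llc_miss=True),
    Instr("add", "r2", ("r1", "r3")),   # depends directly on the missing load
    Instr("mul", "r4", ("r5", "r6")),   # independent
    Instr("ld_hit", "r7", ("r4",)),     # independent load that can proceed
    Instr("sub", "r8", ("r2", "r7")),   # depends indirectly via r2
]
iq, wib = split_iq_and_wib(program)
print([i.name for i in iq])
print([i.name for i in wib])
```

In this toy program the independent multiply and the second load remain eligible to issue while the miss is outstanding, which is the sense in which the quoted statement says MLP can be exploited with a small IQ.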
“…Work in this field has concentrated on the reorder buffer [7,1], the instruction queues [7,16,30], on handling registers [19,6] and on the load/store queue. Load queue resource requirements can be greatly reduced by using techniques for early release of load instructions [8,18].…”
Section: Related Work (mentioning)
confidence: 99%
“…Otherwise they have low execution locality. This observation enables the construction of large-window processors requiring only moderately-sized structures by focusing only on the execution of high locality code [7,16,30,23].…”
Section: Recent Trends in ILP Processors (mentioning)
confidence: 99%