2005
DOI: 10.1109/mm.2005.49
High-Performance Throughput Computing

Cited by 70 publications (30 citation statements)
References 13 publications
“…This estimate is consistent with the current core-size differences in 45 nm for the out-of-order Penryn and the in-order Silverthorne, and is more conservative than the 5x area reduction reported by Asanovic et al. [3]. We then assume a multithreading area overhead of 10%, as reported in Chaudhry et al. [7]. Total die area for the processor and L1 die is estimated to be between 423 mm² (Penryn based) and 491 mm² (Silverthorne based).…”
Section: Cores (supporting)
confidence: 89%
“…Hardware Scouting, described by Chaudhry et al. [7], is an extension of runahead execution that includes several optimizations over previous runahead proposals. In hardware scouting, launching and exiting out of runahead is a zero-latency operation, and runahead mode is also entered on low-latency misses (L2 hits).…”
Section: Runahead Execution and Hardware Scout (mentioning)
confidence: 99%
“…As the memory wall problem has come to overshadow other aspects of processing, various forms of runahead execution have been proposed [21][12][7][3][4]. Runahead execution attempts to reduce the effect of long memory latencies by increasing the memory-level parallelism.…”
Section: Introduction (mentioning)
confidence: 99%
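The excerpt above notes that runahead execution hides long memory latencies by raising memory-level parallelism. A toy back-of-envelope model (the latency figure and the idealization that all independent misses overlap are my assumptions, not numbers from the paper) sketches why overlapping misses beats serializing them:

```python
# Toy model of runahead execution (illustration only; the latency value
# and the "all misses fully overlap" idealization are assumptions).

MISS_LATENCY = 100  # assumed main-memory latency in cycles

def blocking_cycles(independent_misses: int) -> int:
    """An in-order core that stalls on each miss serializes the latencies."""
    return independent_misses * MISS_LATENCY

def runahead_cycles(independent_misses: int) -> int:
    """Under runahead, the core keeps executing past the first miss and
    issues the remaining independent misses as prefetches, so their
    latencies overlap the first one (memory-level parallelism)."""
    if independent_misses == 0:
        return 0
    # First miss pays full latency; each overlapped miss costs ~1 issue cycle.
    return MISS_LATENCY + (independent_misses - 1)

print(blocking_cycles(4))  # 400 cycles when misses serialize
print(runahead_cycles(4))  # 103 cycles when they overlap
```

With four independent misses the blocking core pays roughly 4x the latency, while the idealized runahead core pays little more than one latency window, which is the memory-level-parallelism gain the citing papers describe.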
“…Techniques for reducing the frequency and impact of cache misses include hardware and software prefetching (Chen and Bauer 1994; Klaiber and Levy 1991), speculative loads and execution (Rogers et al. 1992), and multithreading (Agarwal 1992; Byrd and Holliday 1995; Ungerer et al. 2003; Chaudhry et al. 2005; Emer et al. 2007).…”
Section: Introduction (mentioning)
confidence: 99%