2006
DOI: 10.1147/rd.502.0287

Decomposing the load-store queue by function for power reduction and scalability

Abstract: Because they are based on large content-addressable memories, load-store queues (LSQs) present implementation challenges in superscalar processors, especially as issue width and number of in-flight instructions are scaled. In this paper, we propose an alternate organization of an LSQ that separates the time-critical forwarding functionality from checking that loads received their correct values. Two main techniques are exploited: 1) the store forwarding logic is only accessed by those loads and stores…

Cited by 30 publications (33 citation statements)
References 27 publications
“…Proposals can be grouped into three general classes. The first class maintains the age-ordered store queue structure but uses partitioning, filtering, hierarchy, dependence speculation, and speculative forwarding through the primary data cache or other structures to reduce the frequency of associative store queue search or the number of entries examined per search [2,5,12,18,20]. A second class avoids associative search by abandoning the conventional age-ordered structure and replacing it with a cache-like address-indexed structure [6,18,21,24].…”
Section: Related Work (mentioning)
confidence: 99%
“…Associative search constrains the scalability of the store queue, which in turn constrains the scalability of the entire instruction window. To address this challenge, recent work has proposed to reduce both search frequency and the number of entries that must be searched [2,5,12,15,18,20], to replace the fully-associative age-indexed store queue with a set-associative address-indexed forwarding structure [6,21,24], or to maintain the age-ordered structure but replace associative search with speculative indexed access [19,22]. This paper presents NoSQ (short for No Store Queue and pronounced like "mosque"), a microarchitecture that implements in-flight store-load communication without a store queue or any other intermediary structure.…”
Section: Introduction (mentioning)
confidence: 99%
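The contrast drawn in the statement above, between associative search of an age-ordered store queue and an address-indexed forwarding structure, can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `StoreEntry`, `forward_associative`, and `IndexedForwardingTable` are hypothetical, the indexed table is simplified to a direct-mapped one, and stores are assumed to be recorded in program order. It does not reproduce NoSQ or any of the cited designs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StoreEntry:
    age: int    # program-order sequence number (smaller means older)
    addr: int   # store address
    value: int  # store data


def forward_associative(
    store_queue: List[StoreEntry], load_age: int, load_addr: int
) -> Tuple[Optional[int], int]:
    """Age-ordered queue: compare the load address against every entry and
    forward from the youngest older store that matches.
    Returns (forwarded value or None, number of entries examined)."""
    best: Optional[StoreEntry] = None
    examined = 0
    for entry in store_queue:
        examined += 1
        if entry.age < load_age and entry.addr == load_addr:
            if best is None or entry.age > best.age:
                best = entry
    return (best.value if best is not None else None), examined


class IndexedForwardingTable:
    """Address-indexed alternative (assumption: direct-mapped for brevity).
    A load probes exactly one slot instead of searching every entry."""

    def __init__(self, num_sets: int = 16) -> None:
        self.num_sets = num_sets
        self.slots: List[Optional[StoreEntry]] = [None] * num_sets

    def record_store(self, entry: StoreEntry) -> None:
        # A later store mapping to the same slot overwrites the earlier one.
        self.slots[entry.addr % self.num_sets] = entry

    def forward(self, load_age: int, load_addr: int) -> Optional[int]:
        entry = self.slots[load_addr % self.num_sets]
        if entry is not None and entry.addr == load_addr and entry.age < load_age:
            return entry.value
        return None
```

In this model the associative search always examines every entry, which is the cost that grows with queue size. The indexed table probes a single slot but can miss a forwarding opportunity after a conflict overwrite, so a design of this kind would plausibly need a separate check that loads received their correct values.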
“…This L1 structure is backed up by a much larger second-level (L2) structure to correct/complement the work of the L1 structure. The L1 structure can be allocated according to program order or execution order (within a bank, if banked) for every store [1,8,24] or only allocated to those stores predicted to be involved in forwarding [3,17]. The L2 structure is also used in varying ways due to different focuses.…”
Section: Highlight Of Optimized and Alternative Designs (mentioning)
confidence: 99%
“…The L2 structure is also used in varying ways due to different focuses. It can be banked to save energy per access [3,17]; it can be filtered to reduce access frequency (and thus energy) [1,19]; or it can be simplified in functionality such as removing the forwarding capability [24].…”
Section: Highlight Of Optimized and Alternative Designs (mentioning)
confidence: 99%
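The two-level organization described in the statements above can be sketched as a toy model: a small first-level (L1) structure allocated only for stores predicted to forward, backed by a larger second-level (L2) structure that has no forwarding capability and is used only to check load values. The class name `TwoLevelStoreTracking`, its methods, the eviction policy, and the dictionary standing in for memory are all assumptions made for illustration, not the organization of any cited design.

```python
from typing import Dict, List, Tuple


class TwoLevelStoreTracking:
    def __init__(self, l1_capacity: int = 4) -> None:
        self.l1_capacity = l1_capacity
        # L1: small forwarding structure, addr -> (age, value).
        self.l1: Dict[int, Tuple[int, int]] = {}
        # L2: every in-flight store as (age, addr, value); check-only.
        self.l2: List[Tuple[int, int, int]] = []

    def execute_store(self, age: int, addr: int, value: int,
                      predicted_to_forward: bool) -> None:
        self.l2.append((age, addr, value))
        if not predicted_to_forward:
            return
        if addr not in self.l1 and len(self.l1) >= self.l1_capacity:
            # Assumed policy: evict the entry holding the oldest store.
            victim = min(self.l1, key=lambda a: self.l1[a][0])
            del self.l1[victim]
        self.l1[addr] = (age, value)

    def execute_load(self, age: int, addr: int, memory: Dict[int, int]) -> int:
        """Time-critical path: consult only the small L1 structure."""
        hit = self.l1.get(addr)
        if hit is not None and hit[0] < age:
            return hit[1]
        return memory.get(addr, 0)

    def verify_load(self, age: int, addr: int, observed: int,
                    memory: Dict[int, int]) -> bool:
        """Off the critical path: use L2 to check the value the load obtained."""
        expected = memory.get(addr, 0)
        best_age = -1
        for s_age, s_addr, s_value in self.l2:
            if s_addr == addr and best_age < s_age < age:
                best_age, expected = s_age, s_value
        return observed == expected
```

A load whose older store was mispredicted as non-forwarding reads a stale value on the fast path; `verify_load` then returns `False`, which in a real pipeline would correspond to some form of recovery such as replaying the load.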