2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) 2020
DOI: 10.1109/micro50266.2020.00024

I-SPY: Context-Driven Conditional Instruction Prefetching with Coalescing

Citations: cited by 23 publications (11 citation statements)
References: 48 publications
“…The analysis and slice extraction step takes on the order of 100 seconds. Previous works including AutoFDO [22], BOLT [88], AsmDB [7], I-SPY [59], Ripple [60], and Twig [58] have shown that profile-guided optimization techniques are practical and that they are deployed in data centers today.…”
Section: Methods (mentioning)
confidence: 99%
“…Prior work from Google and Facebook shows that their widely-deployed data center applications lose more than 15% of all pipeline slots due to frontend stalls [25,27,67,133]. As these applications are proprietary, we use the applications used by prior work [75,77,78,86,100,138,150], where frontend stalls are similarly frequent (more than 15%) due to large instruction footprints. These applications include cassandra [2], kafka [3], and tomcat [4] from the Java DaCapo benchmark suite [31], drupal [142], wordpress [144], and mediawiki [143] from Facebook's OSS-performance benchmark suite [16], finagle-chirper and finagle-http [12] from the Java Renaissance benchmark suite [114], clang [6] while building LLVM [85], PostgreSQL [10] while serving pgbench [9] queries, Python [14] while running the pyperformance [11] benchmark suite, MySQL [146] while serving TPC-C queries [35], and verilator [13] while emulating the Rocket Chip [7].…”
Section: Experimental Methodology (mentioning)
confidence: 99%
“…Moreover, they lose all the information about a branch every time the corresponding BTB entry is evicted. Since large working set sizes (both instruction and branch footprint) are the key characteristics of data center applications [27,67,75,77,78,132], it is necessary to retain branch reuse behavior even when the corresponding entry is not present in the BTB.…”
Section: Why Do Prior Replacement Policies Fall Short? (mentioning)
confidence: 99%
“…Litz et al [26] present critical slice prefetching (CRISP) to prefetch hard-to-predict loads. Jamilan et al [19] and I-SPY [22] rely on Intel's LBR and dynamic execution information to optimize data and instruction prefetching, respectively. Ayers et al [4] study SPEC and Google workloads and present an automated way of classifying memory access patterns for software-based prefetching.…”
Section: Related Work (mentioning)
confidence: 99%