Proceedings of the 27th Annual International Symposium on Microarchitecture - MICRO 27 1994
DOI: 10.1145/192724.192726

Using branch handling hardware to support profile-driven optimization

Abstract: Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run 2-30 times slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper proposes using existing branch handling hardware to generate profile in…

Cited by 30 publications (12 citation statements)
References: 17 publications
“…Whether the data was gathered through hardware performance counters [Anderson et al 1997], stratified sampling [Sastry et al 2001], or even potentially in fixed ranges [Zilles and Sohi 2001; Zhou et al 2004], the end result is essentially a list of equivalent items and their counts. While there exists some specialized software and hardware systems that attempt to tightly compress particular types of traces [Anderson et al 1997; Conte et al 1996, 1994; Narayanasamy et al 2003; Sastry et al 2001; Zilles and Sohi 2001], we believe that we are the first to present a general hardware-based methodology for storing profiles in a hierarchical fashion.…”
Section: Profile Trees (mentioning)
confidence: 99%
“…Several software techniques, such as binary instrumentation [Buck and Hollingsworth 2000; Luk et al 2005; Srivastava et al 2001; Srivastava and Eustace 1994; Bus et al 2004] and sampling [Arnold and Ryder 2001], can be used to generate and analyze this profile information with only a moderate amount of overhead [Arnold and Ryder 2001; Ball and Larus 1996; Calder et al 1997; Chilimbi 2001; Chilimbi and Hirzel 2002; Duesterwald and Bala 2000; Hirzel and Chilimbi 2001; Larus 1999]. Recently, several researchers have proposed various forms of architectural support [Anderson et al 1997; Conte et al 1996, 1994; Dean et al 1997; Heil and Smith 2000; Narayanasamy et al 2003; Peri et al 1999; Sastry et al 2001; Yang and Gupta 2002; Zilles and Sohi 2001] with the aim of increasing accuracy and further reducing the overhead of software-based techniques. Value profiles can be exploited to perform code specialization [Calder et al 1997], value prediction [Lipasti and Shen 1996; Zhou et al 2003], and value encoding [Yang and Gupta 2002; Yang et al 2000].…”
Section: Profiling With Adaptive Precision (mentioning)
confidence: 99%
“…In [9], a technique is described on how to estimate trivially a program's edge execution frequencies by periodically reading the contents of BTB. In [10] a hardware called Profile Buffer is proposed, which counts the number of times a branch is taken and not taken.…”
Section: A Hardware-support Profiling (mentioning)
confidence: 99%
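
The per-branch counting described in the statement above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `ProfileBuffer`, its methods, and the unbounded dictionary storage are hypothetical choices made for illustration, not the design of the hardware proposed in the cited work, which would have a fixed number of entries and its own replacement policy.

```python
class ProfileBuffer:
    """Toy software model of a buffer that counts, for each branch,
    how often it was taken and how often it was not taken.

    Hypothetical illustration only: real hardware would have a fixed
    number of entries; this model uses an unbounded dict for simplicity.
    """

    def __init__(self):
        # Maps branch address -> [taken_count, not_taken_count].
        self.counts = {}

    def record(self, branch_addr, taken):
        """Update the counters for one executed branch."""
        entry = self.counts.setdefault(branch_addr, [0, 0])
        entry[0 if taken else 1] += 1

    def taken_ratio(self, branch_addr):
        """Return the fraction of executions in which the branch was
        taken, or None if the branch was never recorded."""
        entry = self.counts.get(branch_addr)
        if entry is None:
            return None
        taken, not_taken = entry
        return taken / (taken + not_taken)


if __name__ == "__main__":
    buf = ProfileBuffer()
    # Simulated outcomes for a single branch at a made-up address.
    for outcome in [True, True, False, True]:
        buf.record(0x400, outcome)
    print(buf.counts[0x400], buf.taken_ratio(0x400))
```

A compiler could read such counters after a run to weight the two outgoing edges of each branch, which is the kind of edge-frequency information profile-driven optimizations consume.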
“…This profile operation is somewhat more complex than standard basic block or instruction count profiles, since a simulation of all the alternative component predictors must be performed. However, this information could easily be gathered using on-line performance tools [5,4] that provide stochastic instruction sampling, or, as done for this paper, using binary instrumentation [15].…”
Section: Static Hybrid Predictors (mentioning)
confidence: 99%