1996
DOI: 10.1007/bf03356747

Hardware-Based Profiling: An Effective Technique for Profile-Driven Optimization

Cited by 26 publications (22 citation statements)
References 10 publications
“…Coscheduling could also benefit from per-thread utilization metrics for each shared resource. Others call for cache-line monitors to measure locality and contention in caches, buses, and NUMA systems [33][34][35], using bits for memory state checking [36], using branch predictor history for path profiles [37]. Eyerman et al propose a fundamentally different HPM architecture to collect more meaningful CPI stacks for OoO machines.…”
Section: Issues Facing HPM | mentioning
confidence: 99%
“…A significant amount of recent research is devoted solely to collecting unbiased edge, path, and call stack profiles [31,37,46,47], and one such paper even won the best paper award at PLDI 2009 [2]. This ostensibly simple feature should have long ago become a commodity, programmable and accessible from user space.…”
Section: A Pragmatic Proposition | mentioning
confidence: 99%
“…For example, the Morph system [22] collects profiles via statistical sampling of the program counter on clock interrupts. Alternatively, Conte et al proposed sampling the contents of the branch-prediction hardware using kernel-mode instructions to infer an edge profile [5]. In particular, the tags and target addresses stored in the branch target buffer (BTB) serve to identify an arc in an application, and the branch history stored by the branch predictor can be used to estimate each edge's weight.…”
Section: Related Work | mentioning
confidence: 99%
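
The BTB-sampling idea described in the statement above can be illustrated with a minimal sketch. It is not the implementation from the cited paper: the `BTBEntry` record, the string encoding of branch history, and the `"fallthrough"` label are all hypothetical, and real hardware would expose this state through privileged reads rather than Python objects. The sketch only shows the principle that a branch address and target identify an arc, while the recorded history gives an estimate of how often that arc was taken.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BTBEntry:
    """Hypothetical view of one sampled branch target buffer entry."""
    branch_addr: int  # tag: address of the branch instruction
    target_addr: int  # predicted target address stored in the BTB
    history: str      # assumed encoding of recent outcomes, '1' = taken


def estimate_edge_profile(snapshots):
    """Accumulate estimated arc weights over a series of BTB snapshots.

    Each snapshot is an iterable of BTBEntry objects, as if read
    periodically from the hardware. Returns a dict that maps
    (branch_addr, successor) to an estimated count, where successor is
    either the target address or the label "fallthrough".
    """
    weights = defaultdict(int)
    for snapshot in snapshots:
        for entry in snapshot:
            taken = entry.history.count("1")
            not_taken = len(entry.history) - taken
            # The taken arc is identified by the tag and the target.
            weights[(entry.branch_addr, entry.target_addr)] += taken
            # The not-taken arc falls through to the next instruction.
            weights[(entry.branch_addr, "fallthrough")] += not_taken
    return dict(weights)
```

Calling `estimate_edge_profile` on a list of periodic snapshots yields relative edge weights only; because the history is a short window rather than a full count, the result is a statistical estimate and not an exact edge profile.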
“…Other profiling approaches rely on hardware integrated within the microprocessor to assist software developers in profiling an executing program [7][23][24][28]. Such hardware-assisted profiling approaches utilize event counters or branch execution statistics to identify application hotspots [7] or frequently executed execution paths [23].…”
Section: Previous Work | mentioning
confidence: 99%
“…Such hardware-assisted profiling approaches utilize event counters or branch execution statistics to identify application hotspots [7] or frequently executed execution paths [23]. Although these hardware-assisted profiling approaches may incur lower overheads compared to software-based profiling methods, the runtime overheads cannot be ignored and incur similar ramifications.…”
Section: Previous Work | mentioning
confidence: 99%