Understanding program behavior is at the foundation of computer architecture and program optimization. Many programs have wildly different behavior on even the largest of scales (that is, over the program's complete execution). During one part of the execution, a program can be completely memory bound; in another, it can repeatedly stall on branch mispredicts. Average statistics gathered about a program might not accurately portray where the real problems lie. This realization has ramifications for many architecture and compiler techniques, from how best to schedule threads on a multithreaded machine, to feedback-directed optimizations, power management, and the simulation and testing of architectures. Taking advantage of time-varying behavior requires a set of automated analytic tools and hardware techniques that can discover similarities and changes in program behavior on the largest of time scales.

The challenge in building such tools is that during a program's lifetime it can execute billions or trillions of instructions. How can high-level behavior be extracted from this sea of instructions? The reality is this: the way a program's execution changes over time is not totally random; in fact, it often falls into repeating behaviors, called phases. Automatically identifying this phase behavior is the goal of our research and the key to unlocking many new optimizations. We define a phase as a set of intervals (or slices in time) within a program's execution that have similar behavior, regardless of temporal adjacency. Recent research has shown that it is indeed possible to accurately identify and predict these phases to capture meaningful program behavior [1-8].

The key observation for phase recognition is that any program metric is a direct function of the way a program traverses the code during execution. We can find this phase behavior and classify it by examining only the ratios in which different regions of code are executed over time.
We can simply and quickly collect this information using basic block vector profiles for offline classification [4, 6] or through dynamic branch profiling for online classification [7]. In addition, accurately capturing phase behavior through the computation of a single metric, independent of the underlying architectural details, means that it is possible to …
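As a concrete illustration of this idea (a minimal sketch, not the actual SimPoint profiler), the following code builds a basic block vector from a hypothetical trace of (block id, instruction count) pairs and compares intervals by the ratios of code they execute. The block names, trace format, and toy values are all assumptions for illustration.

```python
from collections import Counter

def bbv_for_interval(block_trace):
    """Build a normalized basic block vector (BBV) for one execution interval.

    block_trace: list of (block_id, instruction_count) pairs, one entry per
    basic block execution. Weighting by instruction count means longer
    blocks contribute proportionally more to the vector.
    """
    counts = Counter()
    total = 0
    for block_id, insns in block_trace:
        counts[block_id] += insns
        total += insns
    # Normalize so each entry is the fraction of the interval's instructions
    # spent in that basic block.
    return {b: c / total for b, c in counts.items()}

def manhattan_distance(v1, v2):
    """Distance between two BBVs; a small value means the two intervals
    exercised the same code in the same ratios."""
    keys = set(v1) | set(v2)
    return sum(abs(v1.get(k, 0.0) - v2.get(k, 0.0)) for k in keys)

# Two intervals that execute the same blocks in the same ratios belong to
# the same phase, no matter how far apart in time they occur.
early = bbv_for_interval([("B1", 4), ("B2", 4)])
late = bbv_for_interval([("B1", 40), ("B2", 40)])   # same ratios, later in time
other = bbv_for_interval([("B3", 8)])               # a different code region
assert manhattan_distance(early, late) == 0.0       # same phase
assert manhattan_distance(early, other) == 2.0      # maximally different
```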
Modern architecture research relies heavily on detailed pipeline simulation.

Keywords: SimPoint, Clustering, Simulation, Fast-forwarding, Sampling

SIMPOINT

Understanding the cycle-level behavior of a processor running an application is crucial to modern computer architecture research. To gain this understanding, detailed cycle-level simulators are typically employed. Unfortunately, this level of detail comes at the cost of speed, and simulating the full execution of an industry-standard benchmark on even the fastest simulator can take weeks to months to complete. This fact has not gone unnoticed in the academic community, and several researchers have started to develop techniques aimed at reducing simulation time.

For architecture research it is often necessary to take one instance of a program with a given input and simulate its performance over many different architecture configurations. The same program binary with the same input may be run hundreds or thousands of times to examine, for example, how the effectiveness of a given architecture changes with its size. Our goals in creating SimPoint [1, 2] are to (1) significantly reduce simulation time, (2) provide an accurate characterization of the full program, and (3) perform the analysis needed to accomplish the first two goals in a matter of minutes. These goals are met by simulating only a handful of intelligently chosen sections of the full program. When these sections (simulation points) are carefully chosen, they provide an accurate picture of the complete execution of the program and yield highly accurate estimates of performance.

The key to our approach is that for a given program and input, the simulation points only need to be chosen once, because we select them using a method that is independent of any particular architecture configuration. The simulation points are selected using a metric based only on the code that is executed over time for a program/input pair.
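To make the arithmetic behind "an accurate picture of the complete execution" concrete: once each simulation point has been simulated in detail, whole-program performance can be estimated as a weighted sum, where each weight is the fraction of execution that the point's cluster covers. The CPI values and weights below are invented for illustration.

```python
# Hypothetical simulation points: each carries the fraction of the program's
# execution its cluster represents (weights sum to 1) and the CPI measured
# by detailed simulation of just that interval. All numbers are made up.
sim_points = [
    {"weight": 0.55, "cpi": 1.20},
    {"weight": 0.30, "cpi": 2.10},
    {"weight": 0.15, "cpi": 0.90},
]

# Estimated whole-program CPI is the weighted sum over simulation points.
estimated_cpi = sum(p["weight"] * p["cpi"] for p in sim_points)
print(f"estimated CPI: {estimated_cpi:.3f}")
```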
Once the simulation points are chosen, they can be used for the hundreds or thousands of independent simulations that may be needed, significantly reducing simulation time.

To pick the simulation points in [1, 2], we introduced the concept of profiling Basic Block Vectors (BBVs) as a way of capturing the important behaviors of the program over time [1]. A basic block vector captures the relative frequency of the code blocks executed during a given portion of execution. After profiling a program with a particular input, we compare the basic block vectors to see how similar they are to one another. Intervals of execution that execute the same code blocks with the same frequency are grouped together into clusters using clustering algorithms from machine learning. We found that sections of execution (represented by basic block vectors) that are grouped into the same cluster have very similar behavior across all the architecture metrics we have examined. Once we break the program into clusters, we pick a single point from each cluster (appropriately weighted) to serve …
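The clustering and selection steps above can be sketched with a tiny k-means: group intervals by BBV similarity, then pick the interval closest to each cluster centroid as that cluster's simulation point, weighted by the cluster's share of all intervals. This is only a sketch under simplifying assumptions (dense toy BBVs, a fixed number of clusters, deterministic initialization); the released SimPoint tool also reduces BBV dimensionality and uses model selection to choose the number of clusters, neither of which is shown here.

```python
def dist(a, b):
    """Euclidean distance between two dense BBVs (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def mean(vectors):
    """Component-wise mean of a list of vectors (the cluster centroid)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iters=20):
    # Deterministic init for the sketch: first k distinct vectors.
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(v)
        if len(centroids) == k:
            break
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assign each interval to its nearest centroid, then recompute.
        assign = [min(range(k), key=lambda j: dist(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, a in zip(vectors, assign) if a == j]
            if members:
                centroids[j] = mean(members)
    return assign, centroids

def pick_simulation_points(bbvs, k):
    """Return one (interval_index, weight) pair per cluster: the interval
    whose BBV is closest to the centroid, weighted by cluster size."""
    assign, centroids = kmeans(bbvs, k)
    points = []
    for j in range(k):
        members = [i for i, a in enumerate(assign) if a == j]
        if not members:
            continue
        rep = min(members, key=lambda i: dist(bbvs[i], centroids[j]))
        points.append((rep, len(members) / len(bbvs)))
    return points

# Toy run: intervals 0-2 exercise one code region and intervals 3-5 another,
# so two phases emerge and one representative interval is picked from each.
bbvs = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.0],
        [0.0, 1.0], [0.1, 0.9], [0.0, 1.0]]
print(pick_simulation_points(bbvs, k=2))
```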