Michael O. McCracken scite author profile

Michael O. McCracken

4 Publications

173 Citation Statements Received

61 Citation Statements Given

Affiliations

Oracle (United States), University of California, San Diego, San Diego Supercomputer Center

Publications

Quantifying Locality In The Memory Access Patterns of HPC Applications

Weinberg

McCracken

Strohmaier

et al.

Several benchmarks for measuring the memory performance of HPC systems along dimensions of spatial and temporal memory locality have recently been proposed. However, little is understood about the relationships of these benchmarks to real applications and to each other. We propose a methodology for producing architecture-neutral characterizations of the spatial and temporal locality exhibited by the memory access patterns of applications. We demonstrate that the results track intuitive notions of locality on several synthetic and application benchmarks. We employ the methodology to analyze the memory performance components of the HPC Challenge Benchmarks, the Apex-MAP benchmark, and their relationships to each other and other benchmarks and applications. We show that this analysis can be used to both increase understanding of the benchmarks and enhance their usefulness by mapping them, along with applications, to a 2-D space along axes of spatial and temporal locality.
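The abstract does not spell out how the locality scores are computed, so the following is only a minimal sketch of two widely used, architecture-neutral proxies: reuse distance for temporal locality and a stride histogram for spatial locality. The function names and the toy address trace are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def reuse_distances(trace):
    """Reuse distance per access: the number of distinct addresses touched
    since the previous access to the same address (None on first use).
    Simple O(n^2) version, meant for illustration only."""
    last_seen = {}
    distances = []
    for pos, addr in enumerate(trace):
        if addr in last_seen:
            between = trace[last_seen[addr] + 1:pos]
            distances.append(len(set(between)))
        else:
            distances.append(None)
        last_seen[addr] = pos
    return distances


def stride_histogram(trace):
    """Count absolute address differences between consecutive accesses."""
    return Counter(abs(b - a) for a, b in zip(trace, trace[1:]))


# Hypothetical trace: three addresses swept twice.
trace = [0, 1, 2, 0, 1, 2]
print(reuse_distances(trace))
print(stride_histogram(trace))
```

Short reuse distances suggest high temporal locality and a histogram dominated by small strides suggests high spatial locality; how such measurements are reduced to a point in the 2-D space is left to the paper itself.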

show abstract

Statistical scalability analysis of communication operations in distributed applications

Vetter

McCracken

2001

SIGPLAN Not.

Statistical scalability analysis of communication operations in distributed applications

Vetter

McCracken

2001

Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications, and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.

[...] 10, 21]; however, many of these techniques have not been extended to help users understand application scalability. Essentially, this previous work has focused on helping users understand the performance data of one application experiment, [...]

[...] where $T_{i,op}$ provides the total time that task $i$ spent in the op type of communication operation. Figure 2 shows the breakdown of communication operations by type for BT. Viewed in this light, we can focus our scalability analysis on one type of communication operation in BT: Wait, Barrier, Waitall, and Comm_split.

Up to 81 tasks, $T_{agg,wait}$ remains the largest component, but relatively constant; yet beyond 81 tasks, $T_{agg,wait}$ dominates the communication time. Also, Barrier, Waitall, and Comm_split emerge noticeably from the group as the number of tasks increases. Realistically, however, applications call these operations from multiple call sites. So, as a final step, we split $T_{agg,wait}$ into individual call sites. Recasting $T_{i,wait}$ as $T_{i,wait,callsite}$ where [...]
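The measure named in the abstract, a non-parametric correlation between the number of tasks and each operation's share of overall communication time, can be sketched roughly as follows. This assumes Spearman's rank correlation as the non-parametric statistic, which the abstract does not confirm; the function names and all timing numbers are hypothetical and are not data from the paper.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no rank information
    return cov / (var_x * var_y) ** 0.5


def scaling_scores(task_counts, op_times, total_comm_times):
    """Correlate task count with each operation's share of communication time."""
    scores = {}
    for op, times in op_times.items():
        shares = [t / total for t, total in zip(times, total_comm_times)]
        scores[op] = spearman(task_counts, shares)
    return scores


# Hypothetical aggregate times (seconds) for five experiment sizes.
task_counts = [4, 9, 16, 25, 36]
total_comm_times = [10.0, 12.0, 15.0, 20.0, 30.0]
op_times = {
    "Wait": [2.0, 3.0, 5.0, 9.0, 18.0],
    "Send": [8.0, 9.0, 10.0, 11.0, 12.0],
}
print(scaling_scores(task_counts, op_times, total_comm_times))
```

In this sketch, a score near +1 marks an operation whose share of communication time grows steadily with the task count, making it a candidate for poor scaling, while a score near -1 marks one whose share shrinks.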

WRF nature run

Michalakes

Hacker

Loft

et al. 2007

The Weather Research and Forecast (WRF) model is a limited-area model of the atmosphere for mesoscale research and operational numerical weather prediction (NWP). A petascale problem is a WRF nature run that provides very high-resolution "truth" against which coarser simulations or perturbation runs may be compared for purposes of studying predictability, stochastic parameterization, and fundamental dynamics. We carried out a nature run involving an idealized high-resolution rotating fluid on the hemisphere to investigate scales that span the $k^{-3}$ to $k^{-5/3}$ kinetic energy spectral transition of the observed atmosphere using 65,536 processors of the BG/L machine at LLNL. We worked through issues of parallel I/O and scalability. The primary result is not just the scalability and high Tflops number, but an important step towards understanding weather predictability at high resolution.
