Developing and tuning computational science applications to run on extreme-scale systems are increasingly complicated processes. Challenges such as managing memory access and tuning message-passing behavior are made easier by tools designed specifically to aid in these processes. Tools that help users understand the I/O behavior of their applications, however, have not yet reached the level of utility necessary to play a central role in application development and tuning. This deficiency in the tool set means that we have a poor understanding of how specific applications interact with storage. Worse, the community has little knowledge of what sorts of access patterns are common in today's applications, leaving the storage research community uncertain about the pressing needs of the computational science community. This paper describes the Darshan I/O characterization tool. Darshan is designed to capture an accurate picture of application I/O behavior, including properties such as patterns of access within files, with the minimum possible overhead. This characterization can shed important light on the I/O behavior of applications at extreme scale. Darshan can also give researchers greater insight into the overall patterns of access exhibited by such applications, helping the storage community understand how best to serve current computational science applications and better predict the needs of future applications. In this work we demonstrate Darshan's ability to characterize the I/O behavior of four scientific applications and show that it induces negligible overhead for I/O-intensive jobs with as many as 65,536 processes.
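The low-overhead characterization described above can be illustrated in miniature: rather than logging every I/O call (a trace), the instrumentation layer accumulates compact counters per file. The sketch below is an illustrative stand-in, not Darshan's actual API; the class name and counter keys are invented for the example.

```python
import io
from collections import Counter

class IOCharacterizer:
    """Wrap a file-like object and accumulate counters (call counts and
    an access-size histogram) instead of a full trace, keeping the
    memory and runtime overhead of characterization small.
    Illustrative sketch only; names do not correspond to Darshan's API."""

    def __init__(self, f):
        self._f = f
        self.counters = Counter()

    def read(self, n):
        self.counters["reads"] += 1
        self.counters[f"read_size_{n}"] += 1  # bucket by requested size
        return self._f.read(n)

    def write(self, data):
        self.counters["writes"] += 1
        self.counters[f"write_size_{len(data)}"] += 1
        return self._f.write(data)

# Example: characterize a small in-memory "file"
f = IOCharacterizer(io.BytesIO())
f.write(b"hello")
f.write(b"hi")
```

After the run, `f.counters` summarizes the access pattern (two writes, one each of size 5 and 2) in constant space per distinct size, which is what makes this style of characterization cheap at scale.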
We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.
The Distributed ASCI Supercomputer (DAS) is a homogeneous wide-area distributed system consisting of four cluster computers at different locations. DAS has been used for research on communication software, parallel languages and programming systems, schedulers, parallel applications, and distributed applications. The paper gives a preview of the most interesting research results obtained so far in the DAS project.
We investigate operating system noise, which we identify as one of the main reasons for a lack of synchronicity in parallel applications. Using a microbenchmark, we measure the noise on several contemporary platforms and find that, even with a general-purpose operating system, noise can be limited if certain precautions are taken. We then inject artificially generated noise into a massively parallel system and measure its influence on the performance of collective operations. Our experiments indicate that on extreme-scale platforms, the performance is correlated with the largest interruption to the application, even if the probability of such an interruption on a single process is extremely small. We demonstrate that synchronizing the noise can significantly reduce its negative influence.
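The microbenchmark approach mentioned above can be sketched simply: time a fixed work quantum many times, take the fastest iteration as the noise-free baseline, and treat large outliers as operating system interruptions. This is a minimal illustrative sketch under that assumption; the function name and thresholds are invented for the example.

```python
import time

def noise_benchmark(n_iter=200_000, work=lambda: None):
    """Run a fixed work quantum repeatedly and record per-iteration
    timings. Iterations well above the minimum indicate interruptions
    ("noise"). Simplified sketch of the fixed-work-quantum method:
    the factor-of-2 outlier threshold is an arbitrary choice."""
    samples = []
    for _ in range(n_iter):
        t0 = time.perf_counter()
        work()
        samples.append(time.perf_counter() - t0)
    base = min(samples)  # assume the fastest iteration is noise-free
    detours = [s - base for s in samples if s > 2 * base]
    return base, max(samples), detours

base, worst, detours = noise_benchmark(n_iter=10_000)
```

On a general-purpose OS, `worst` is typically orders of magnitude above `base` even when `detours` is a tiny fraction of all iterations, which mirrors the paper's observation that rare, large interruptions dominate the impact on collectives at scale.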
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. I/O forwarding is a paradigm that attempts to bridge the growing performance and scalability gap between the compute and I/O components of leadership-class machines, meeting the requirements of data-intensive applications by shipping I/O calls from compute nodes to dedicated I/O nodes. I/O forwarding is a critical component of the I/O subsystem of the IBM Blue Gene/P supercomputer currently deployed at several leadership computing facilities. In this paper, we evaluate the performance of the existing I/O forwarding mechanisms for BG/P and identify the performance bottlenecks in the current design. We augment the I/O forwarding with two approaches: I/O scheduling using a work-queue model and asynchronous data staging. We evaluate the efficacy of our approaches using microbenchmarks and application-level benchmarks on leadership-class systems.

I. INTRODUCTION

Leadership-class systems are providing unprecedented opportunities to advance science in numerous fields, such as climate sciences, biosciences, astrophysics, computational chemistry, materials sciences, high-energy physics, and nuclear physics [17]. Current leadership-class machines such as the IBM Blue Gene/P (BG/P) supercomputer at Argonne National Laboratory and the Cray XT system at Oak Ridge National Laboratory consist of a few hundred thousand processing elements. BG/P is the second generation of supercomputers in the Blue Gene series and has demonstrated ultrascale performance together with a novel energy-efficient design. As of November 2009, five of the top 20 systems in the Top 500 list [20] and thirteen of the top 20 most power-efficient systems were based on the Blue Gene solution [8]. While the computational power of supercomputers keeps increasing with every generation, the I/O systems have not kept pace, resulting in a significant performance bottleneck.
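The work-queue scheduling model mentioned above can be sketched in miniature: requests shipped from compute processes accumulate in a queue on the I/O node and are drained by a pool of workers as capacity frees up. This is an illustrative sketch only, not ZOID or IOFSL code; the function name and the `(op, payload)` request format are invented for the example.

```python
import queue
import threading

def io_forwarder(requests, n_workers=4):
    """Drain a queue of (op, payload) I/O requests with a pool of worker
    threads, emulating work-queue I/O scheduling on an I/O node.
    The real I/O is replaced by recording (op, size) pairs."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:      # sentinel: shut this worker down
                break
            op, payload = item
            with lock:
                results.append((op, len(payload)))  # stand-in for real I/O

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for r in requests:
        q.put(r)
    for _ in workers:             # one sentinel per worker
        q.put(None)
    for w in workers:
        w.join()
    return results

done = io_forwarder([("write", b"abc"), ("write", b"de"), ("read", b"")],
                    n_workers=2)
```

The design point the model captures: decoupling request arrival from service lets the I/O node reorder or batch work, which is where scheduling gains come from.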
In order to achieve higher performance, many HPC systems run a stripped-down operating system kernel on the compute nodes to reduce the operating system "noise." The IBM Blue Gene series of supercomputers takes this a step further, restricting I/O operations from the compute nodes. In order to enable applications to perform I/O, the compute node kernel ships all I/O operations to a dedicated I/O node, which performs I/O on behalf of the compute nodes. This process is known as I/O forwarding [3].
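The forwarding mechanism described above can be sketched as a pair of roles: a compute-side shim that ships each operation as a message, and an I/O-node service that holds the real file state and executes the operations. This is a minimal illustrative sketch with the transport reduced to a direct call (on BG/P the message crosses the tree network); both class names and the message format are invented for the example.

```python
import io

class IONode:
    """Stand-in for a dedicated I/O node: owns the file handles and
    performs I/O on behalf of compute processes."""

    def __init__(self):
        self.files = {}
        self.next_fd = 0

    def handle(self, msg):
        op = msg["op"]
        if op == "open":
            fd = self.next_fd
            self.next_fd += 1
            self.files[fd] = io.BytesIO()  # in-memory stand-in for a file
            return fd
        if op == "write":
            return self.files[msg["fd"]].write(msg["data"])
        if op == "read":
            f = self.files[msg["fd"]]
            f.seek(0)
            return f.read()
        raise ValueError(f"unknown op {op!r}")

class ComputeNodeStub:
    """Compute-side shim: instead of issuing I/O locally, ship each
    operation to the I/O node as a message."""

    def __init__(self, ionode):
        self.ionode = ionode

    def open(self):
        return self.ionode.handle({"op": "open"})

    def write(self, fd, data):
        return self.ionode.handle({"op": "write", "fd": fd, "data": data})

    def read(self, fd):
        return self.ionode.handle({"op": "read", "fd": fd})

stub = ComputeNodeStub(IONode())
fd = stub.open()
```

The compute kernel stays minimal because it only serializes and forwards; all file-system complexity lives on the I/O node.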