SC 2008 - International Conference for High Performance Computing, Networking, Storage and Analysis
DOI: 10.1109/sc.2008.5219768

Toward loosely coupled programming on petascale systems

Abstract: We have extended the Falkon execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of…


Cited by 75 publications (76 citation statements)
References 19 publications
“…Although no overhead is detected up to 256 CPUs, there is a maximum overhead of 7% (VEGFR2) at 4096 CPUs, which increases up to 16.3% (PARP) at 8192 CPUs. At 16,384 CPUs, the overhead varies from 27.8% (ACE) up to 34.3% (PARP). The performance of Autodock4.lga.MPI is similar to that of the program Dovis2 using 256 CPUs [31], as well as the program Dock6.MPI that shows overheads of about 8% at 4096 CPUs, 12% at 8192 CPUs, and 45% at 16,384 CPUs.…”
Section: Scaling on HPC Architecture
confidence: 99%
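The overhead percentages quoted above can be understood as extra wall-clock time beyond an ideal (overhead-free) run, expressed relative to that ideal. A minimal sketch of that calculation, using hypothetical timings rather than figures from the paper:

```python
def overhead_pct(measured_s: float, ideal_s: float) -> float:
    """Relative overhead: time spent beyond the ideal run, as a percentage of the ideal."""
    return 100.0 * (measured_s - ideal_s) / ideal_s

# Hypothetical example: an ideal 100 s run that actually takes 116.3 s
print(round(overhead_pct(116.3, 100.0), 1))  # 16.3
```

This convention (overhead relative to the ideal time) is an assumption for illustration; the cited benchmarks may normalize differently.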
“…[14-16] The performance can be significantly affected by time spent by the system in computations that do not contribute to the advancement of any user tasks (i.e., “overhead”), such as task management, allocation of resources, and input/output (I/O) operations (open/read/write/close) [14-16]. A task-parallel docking procedure (or high-throughput computing [17]), i.e., n simultaneous and independent docking jobs running on n CPUs, can be carried out by executing a serial docking program (i.e., one that cannot run on more than one CPU) on each of the n CPUs.…”
Section: Introduction
confidence: 99%
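The task-parallel pattern described in this citation statement, n independent serial jobs running concurrently with no coordination between them, can be sketched with Python's standard multiprocessing pool. Here `docking_job` is a hypothetical stand-in for one serial docking run, not code from the cited work:

```python
from multiprocessing import Pool


def docking_job(ligand_id: int) -> tuple[int, float]:
    """Hypothetical stand-in for one serial docking run; returns (ligand id, score)."""
    score = 0.5 * ligand_id  # placeholder for the real docking computation
    return ligand_id, score


if __name__ == "__main__":
    ligands = range(8)
    # The jobs are fully independent, so the pool can run them
    # concurrently without any inter-task communication.
    with Pool(processes=4) as pool:
        results = pool.map(docking_job, ligands)
    print(results[:2])  # [(0, 0.0), (1, 0.5)]
```

At supercomputer scale the pool would be replaced by a many-task framework such as Falkon, but the structure (one serial program per CPU, no modifications to the application) is the same.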
“…We view this approach as overly restrictive and potentially harmful in several ways: system reliability is jeopardized by more reboot cycles, diagnosing and monitoring the health of individual nodes is difficult, and the system is less available for use. Management based on virtualization would also make it possible to backfill work on the machine using loosely-coupled programming jobs [26] or other low priority work. A batch-submission or grid computing system could be run on a collection of nodes where a new OS stack could be dynamically launched; this system could also be brought up and torn down as needed.…”
Section: Motivation
confidence: 99%