2003
DOI: 10.1177/1094342003017001005

Communication and Optimization Aspects of Parallel Programming Models on Hybrid Architectures

Abstract: Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distributed memory parallelization on the node interconnect with the shared memory parallelization inside each node. The hybrid MPI+OpenMP programming model is compared with pure MPI, compiler based parallelization, and other parallel programming models on hybrid architectures. The paper focuses on bandwidth and latency aspects, and also on whether programming paradigms can separate the optimization of communication and…

Cited by 41 publications (28 citation statements); references 14 publications.
“…Thus, we consider programs that use the common THREAD MASTERONLY model [13]. Its hierarchical decomposition closely matches most large-scale HPC systems, which are comprised of clustered nodes, each of which has multiple cores per node, distributed across multiple processors.…”
Section: Hybrid MPI/OpenMP Terminology
confidence: 99%
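The masteronly model quoted above (all MPI communication issued by the master thread, outside the OpenMP-parallel numerics) can be sketched with a Python threading analogy. This is an illustrative model, not code from the paper: `compute` stands in for the OpenMP-parallel work, `exchange` for an MPI call such as `MPI_Allreduce`, and `ThreadPoolExecutor` for the OpenMP thread team.

```python
from concurrent.futures import ThreadPoolExecutor

def compute(chunk):
    # Stand-in for the OpenMP-parallel numerical kernel.
    return sum(x * x for x in chunk)

def exchange(local):
    # Stand-in for an MPI call; with a single process there is
    # nothing to combine, so the local value is returned as-is.
    return local

def masteronly_step(data, n_threads=4):
    # "Parallel region": all threads compute on disjoint chunks.
    chunks = [data[i::n_threads] for i in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = list(pool.map(compute, chunks))
    # Outside the parallel region only the master communicates.
    return exchange(sum(partials))
```

The key structural point the excerpt makes is the strict alternation of phases: threaded computation, then single-threaded communication, with no MPI calls from inside the parallel region.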
“…The notion of overlapping communication and computation in various ways has been described before [10,11] but we present here a new way based on the new functionality of the OpenMP tasking model. OpenMP version 3.0 introduces the task directive, which allows the programmer to specify a unit of parallel work called an explicit task, which express unstructured parallelism and defines dynamically generated work units that will be processed by the team [1].…”
Section: The GTS Particle Shifter and How To Fight
confidence: 99%
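The OpenMP 3.0 `task` construct described in this excerpt creates dynamically generated work units that any thread of the team may execute. A rough Python analogy (hypothetical, not from the cited work) uses `ThreadPoolExecutor.submit`, where each `submit` call plays the role of one `#pragma omp task`:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item):
    # Stand-in for the body of one explicit task.
    return item * 2

def run_tasks(items, n_threads=4):
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Each submit() models one dynamically generated explicit task;
        # the pool's worker threads play the role of the OpenMP team.
        futures = [pool.submit(process, it) for it in items]
        return sorted(f.result() for f in futures)
```

As in OpenMP tasking, the number of work units need not match the number of threads, which is what makes the construct suitable for the unstructured parallelism the excerpt mentions.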
“…Which approach -using OpenMP tasking or new MPI non-blocking collectives -performs best remains to be seen once the new MPI 3.0 version is available. Rabenseifner and Wellein [11] point out that the benefit is limited, mainly because the communication time can be hidden by parallelizing it to the numerical threads (which reduces the available threads for numerics by one). Therefore, without parallelizing communication with computation the maximum benefit ratio is (2 − 1/n) on n threads.…”
Section: ! ADDING SHIFTED PARTICLES FROM LEFT !
confidence: 99%
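The (2 − 1/n) bound quoted above follows from a simple time model: without overlap, a step costs the serial communication time plus the computation time spread over n threads; with overlap, one thread is dedicated to communication and the computation runs on the remaining n − 1 threads, so the step costs the maximum of the two. A small sketch of this model (variable names are mine, the formula is from the excerpt):

```python
def overlap_benefit(t_comm, t_comp, n):
    """Speedup from hiding non-parallelized communication behind
    computation by dedicating 1 of n threads to communication."""
    baseline = t_comm + t_comp / n              # comm, then comp on n threads
    overlapped = max(t_comm, t_comp / (n - 1))  # comp on n-1 threads hides comm
    return baseline / overlapped
```

The ratio is maximized when the communication exactly fills the overlapped compute time, i.e. t_comm = t_comp / (n − 1); substituting gives 1 + (n − 1)/n = 2 − 1/n, matching the bound Rabenseifner and Wellein state.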
“…Nevertheless, a lot of important scientific work enlightens the complexity of the many aspects that affect the overall performance of hybrid programs ( [2], [8], [10]). Also, the need for a multi-threading MPI implementation that will efficiently support the hybrid model has been spotted by the research community ( [11], [9]).…”
Section: Introduction
confidence: 99%