2008
DOI: 10.1109/ipdps.2008.4536243

A plug-and-play model for evaluating wavefront computations on parallel architectures

Abstract: This paper develops a plug-and-play reusable LogGP model that can be used to predict the runtime and scaling behavior of different MPI-based pipelined wavefront applications running on modern parallel platforms with multicore nodes. A key new feature of the model is that it requires only a few simple input parameters to project performance for wavefront codes with different structure to the sweeps in each iteration as well as different behavior during each wavefront computation and/or between iterations. We ap…
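For background (this is not part of the abstract above, just a reminder of the underlying model): LogGP characterises a parallel machine by L (network latency), o (per-message CPU overhead), g (gap between consecutive short messages), G (gap per byte for long messages) and P (number of processors). The cost of a single k-byte point-to-point message is then usually approximated as

    T_p2p(k) ≈ o + (k − 1)·G + L + o,

and LogGP-based wavefront models of this kind compose such terms, together with per-cell compute times, along the critical path of the pipelined sweeps.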

Cited by 28 publications (42 citation statements)
References 9 publications
“…One particular example is that several application models assume that broadcast or allreduce scale with Θ(S log(P)) (e.g., [3,17]) while, as demonstrated in Section 4, a good MPI implementation would implement a broadcast or allreduce with Θ(S + log(P)) [5,13,21]. Generally speaking, performance models for middleware libraries such as MPI depend on the parameters of the network (e.g., bandwidth, latency, topology, routing) and the implemented algorithms (e.g., collective algorithms, eager and rendezvous protocols) and are thus hard to generalize.…”
Section: Motivation (mentioning)
confidence: 99%
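To make the gap concrete (the symbols below are illustrative and not taken from the cited papers): with S the message size in bytes, P the number of processes, α a per-message latency and β a per-byte cost, a binomial-tree broadcast forwards the full message across ⌈log₂ P⌉ levels,

    T_tree(S, P) ≈ ⌈log₂ P⌉ · (α + β·S) = Θ(S log P),

whereas a scatter followed by a recursive-doubling allgather moves mostly S/P-sized pieces,

    T_scatter+allgather(S, P) ≈ 2·log₂(P)·α + 2·((P − 1)/P)·β·S = Θ(S + log P),

which is the Θ(S + log(P)) behaviour the excerpt attributes to a good MPI implementation.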
“…An iteration of Hydra employs several parallel functions, each of which consists of a number of the above operations. An aim of this work is to capture the time to solution by modelling the critical-path run-time of the code, as demonstrated in previous analytic modelling research (Mudalige et al., 2008). We develop a general analytic model for the first two key operations (local computation, near-neighbour communication) before applying these to specific segments of the Hydra code.…”
Section: A Predictive Model For Hydra (mentioning)
confidence: 99%
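As a purely illustrative sketch of that decomposition (the symbols here are not taken from the Hydra model itself), the per-iteration critical-path time of such a code might be written as

    T_iter ≈ T_compute + T_comm ≈ W_g·N_local + Σ_i (L + o + G·S_i),

where W_g is a measured per-element compute ("grind") time, N_local the number of locally owned elements, and S_i the size of the message exchanged with neighbour i, with the communication term expressed in LogGP-style parameters as in the abstract above.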
“…From the number of inter- and intra-node connections (equations 11 and 12), we can derive a model for near-neighbour communications in a similar fashion to that found in Mudalige et al. (2008). This makes the assumption that the communication network is full-duplex and that the time for two nodes to perform a non-blocking send and receive is equivalent to the time for a single blocking send and receive, because of Hydra's use of MPI_Waitall.…”
Section: Near Neighbour Point-to-point (mentioning)
confidence: 99%
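The communication pattern that assumption refers to can be illustrated with a minimal MPI sketch in C (the neighbour list, buffer layout and message size below are illustrative, not taken from Hydra):

/* Minimal sketch of a near-neighbour exchange: post all non-blocking
 * receives and sends, then complete them with a single MPI_Waitall.
 * On a full-duplex network the per-neighbour sends and receives overlap,
 * which is why the excerpt's model costs the exchange as roughly one
 * blocking send/receive pair per neighbour. */
#include <mpi.h>
#include <stdlib.h>

void halo_exchange(double *send_buf[], double *recv_buf[],
                   const int *nbr, int n_nbrs, int count, MPI_Comm comm)
{
    MPI_Request *reqs = malloc(2 * n_nbrs * sizeof(MPI_Request));

    /* Post receives first, then sends; nothing blocks here. */
    for (int i = 0; i < n_nbrs; i++)
        MPI_Irecv(recv_buf[i], count, MPI_DOUBLE, nbr[i], 0, comm, &reqs[i]);
    for (int i = 0; i < n_nbrs; i++)
        MPI_Isend(send_buf[i], count, MPI_DOUBLE, nbr[i], 0, comm,
                  &reqs[n_nbrs + i]);

    /* A single MPI_Waitall completes the whole exchange at once. */
    MPI_Waitall(2 * n_nbrs, reqs, MPI_STATUSES_IGNORE);
    free(reqs);
}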
“…Many HPC centres are therefore turning to alternative tools and methodologies (e.g. predictive performance modelling [1], [2], hardware simulation [3], [4] and mini-applications [5], [6]) to facilitate system evaluation, to aid in the comparison of multiple candidate machines, to investigate optimisation strategies, and to act as a vehicle for porting codes to novel architectures.…”
mentioning
confidence: 99%