49th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition 2011
DOI: 10.2514/6.2011-947

Scalability of Incompressible Flow Computations on Multi-GPU Clusters Using Dual-Level and Tri-Level Parallelism

Abstract: High-performance computing using graphics processing units (GPUs) is gaining popularity in the scientific computing field, with many large compute clusters being augmented with multiple GPUs in each node. We investigate hybrid tri-level (MPI-OpenMP-CUDA) parallel implementations to explore the efficiency and scalability of incompressible flow computations on GPU clusters of up to 128 GPUs. This work details some of the unique issues faced when merging fine-grain parallelism on the GPU using CUDA with coarse-grain…

Cited by 10 publications (4 citation statements)
References 28 publications (33 reference statements)
“…CPU (central processing unit) + Intel Xeon-phi co-processors were used for heterogeneous parallel computing employing MPI, OpenMP, and offload programming model. CUDA, OpenMP, and hybrid OpenMP + CUDA-based parallelization for in-house CFD codes is also reported in the literature (Simmendinger and Kügeler 2010; Kafui et al 2011; Xu et al 2014; Jacobsen and Senocak 2011). Review articles by Afzal et al (2017) and Pinto et al (2016, 2017) provide a detailed insight into parallel computing strategies for different CFD applications.…”
Section: Introduction (mentioning)
confidence: 95%
“…In this approach, each MPI process solves the problem on a sub-domain using the GPU it is associated with. Such approaches can be found in Komatitsch et al (2010); Jacobsen and Senocak (2011); Lai et al (2019); Viñas et al (2013); Turchetto et al (2020). This paper describes a methodology for porting a finite volume solver for the SWE on a multi-GPU architecture.…”
Section: Introduction (mentioning)
confidence: 99%
“…Since appearance, GPU has shown distinctive prospects across a large range of fields in practice, for instance, artificial intelligence, deep learning, molecular dynamics, quantum chemistry, high-energy physics, and likewise, in CFD applications. Researchers have made the technology of extension mature from single to several GPUs and even clusters [6][7][8], including the different speedups between explicit and implicit schemes [9], the variance among structured, unstructured and hybrid grids [10,11], the influence of single and double precision [12], as well as high-order schemes and high-fidelity methods attracting increasing attention [13][14][15][16][17][18]. Contributed by hardware's development, GPU has possessed the power of simulating more complicated problems, such as turbulence, where LES was studies earlier [19,20] but DNS was still in the infancy [21][22][23][24].…”
Section: Introduction (mentioning)
confidence: 99%