Practical CFD Simulations on Programmable Graphics Hardware using SMAC†

Scheidegger, Carlos; Comba, João Luiz Dihl; Cunha, Rudnei Dias da

doi:10.1111/j.1467-8659.2005.00897.x

Cited by 27 publications (21 citation statements)

References 17 publications (27 reference statements)


“…Scheidegger et al were able to show that NVIDIA GPUs based on the NV35 and NV40 architectures were well suited for CFD simulations using the SMAC (Simplified Marker and Cell) method. With sufficiently dense grids, they were able to achieve speedup factors as high as 21 [12]. Similarly, Hagen et al achieved speedups as high as 20 using NVIDIA GeForce GPUs.…”

Section: Introduction (mentioning)

confidence: 95%

“…For example, if a three-dimensional array x(i,j,n) were created in Fortran, the data in the first ('i') column would be ordered sequentially in system memory. That is, the data stored at x(2,1,1) would be directly adjacent in memory to the data stored at x(1,1,1).…”

Section: B General Programming Strategy (mentioning)

confidence: 99%
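
The column-major layout described in the statement above can be illustrated with a small sketch. It is not taken from the cited paper: the function name `column_major_offset` and the array extents are hypothetical, chosen only to show why x(2,1,1) sits directly next to x(1,1,1) in a Fortran-ordered array.

```python
def column_major_offset(i, j, n, dims):
    """Zero-based linear offset of the 1-based element x(i, j, n)
    in a column-major (Fortran-order) array of shape dims = (ni, nj, nn)."""
    ni, nj, nn = dims
    if not (1 <= i <= ni and 1 <= j <= nj and 1 <= n <= nn):
        raise IndexError("index out of bounds")
    # The first index varies fastest, so it has stride 1;
    # the second has stride ni, the third has stride ni * nj.
    return (i - 1) + ni * ((j - 1) + nj * (n - 1))


dims = (4, 3, 2)  # hypothetical extents, for illustration only
print(column_major_offset(1, 1, 1, dims))  # 0
print(column_major_offset(2, 1, 1, dims))  # 1, adjacent to x(1,1,1)
print(column_major_offset(1, 2, 1, dims))  # 4, one full column away
```

Stepping the first index moves one element in memory, while stepping the second index jumps by the full length of the first dimension, which is why loops over Fortran arrays are usually written with the first index innermost.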

“…Research has shown that GPUs can reliably produce an order of magnitude speedup, when measured against comparable high performance CPUs, for a wide range of scientific applications [5][6][7][8][9][10][11][12][13][14][15]. Bolz et al…”

Section: Introduction (mentioning)

confidence: 99%


Multidisciplinary Simulation Acceleration using Multiple Shared-Memory Graphical Processing Units

Kemal, Davis, Owens. 2015. AIAA Infotech @ Aerospace



“…In fact, just prior to its first release, Owens et al [9] comprehensively surveyed the field of general-purpose computation on graphics hardware (GPGPU), which included a number of primarily structured grid based solvers, such as those of Harris [10], Scheidegger et al [11], and Hagen et al [12]. However, the architecture has changed substantially and many of the limitations of GPGPU via traditional graphics APIs such as OpenGL are no longer an issue.…”

Section: Introduction (mentioning)

confidence: 99%

Running Unstructured Grid Based CFD Solvers on Modern Graphics Hardware

Corrigan, Camelli, Löhner, et al. 2009. 19th AIAA Computational Fluid Dynamics


Techniques used to implement an unstructured grid solver on modern graphics hardware are described. The three-dimensional Euler equations for inviscid, compressible flow are considered. Effective memory bandwidth is improved by reducing total global memory access and overlapping redundant computation, as well as using an appropriate numbering scheme and data layout. The applicability of per-block shared memory is also considered. The performance of the solver is demonstrated on two benchmark cases: a missile and the NACA0012 wing. For a variety of mesh sizes, an average speed-up factor of roughly 9.5x is observed over the equivalent parallelized OpenMP-code running on a quad-core CPU, and roughly 33x over the equivalent code running in serial.


“…Many of these researchers implement fluid simulations; however, most are limited to the 2D incompressible Navier-Stokes equations using pressure projection methods [5][6][7][8][9], which are interesting from a performance and simulation point of view, but use simplified numerics which lack the proper accuracy required for engineering design applications. Furthermore, neglecting the effects of compressibility leads to much simpler formulations which are not applicable to transonic and supersonic flows of interest.…”

Section: A Previous Work (mentioning)

confidence: 99%

Unsteady Turbulent Simulations on a Cluster of Graphics Processors

Phillips, Davis, Owens. 2010. 40th Fluid Dynamics Conference and Exhibit


This paper describes the GPU accelerated MBFLO2 multi-block turbulent flow solver completely in double precision using CUDA and the latest generation of GPU processors. On a cluster of 8 Tesla C2050 "Fermi" GPUs and Intel Xeon X5550 "Nehalem" quad-core CPUs, we achieve 9x speedup over the parallel CPU solver or 70x speedup over the serial solver. High performance is obtained by optimizing the data layout on the GPU, optimizing data transfers and using asynchronous memory copies to overlap GPU execution with communications. We test the solver on a turbulent flat plate and an unsteady turbulent cylinder with 3.2 million grid points. We confirm the GPU results are in agreement with turbulent flow theory. We discuss the GPU optimization techniques used to reach this level of performance.

Nomenclature

E = total energy
g = acceleration due to gravity
H = total enthalpy
h = static enthalpy
I = rothalpy
k = turbulent kinetic energy
p = pressure
Pr = Prandtl number
Pr_t = turbulent Prandtl number
R = radius from specified axis of rotation
S_ij = mean strain-rate tensor
u = axial velocity component
û = internal energy
v = tangential velocity component
V = velocity magnitude
z = elevation
μ = coefficient of viscosity
μ_t = turbulent coefficient of viscosity
ω = turbulent dissipation rate divided by turbulent kinetic energy
Ω = rotational velocity about specified axis of rotation (rad/s)
τ_ij = shear stress tensor
ρ = density

