Acceleration of the IMplicit–EXplicit nonhydrostatic unified model of the atmosphere on manycore processors

Abdi, Daniel S.; Giraldo, Francis X.; Constantinescu, Emil M.; Carr, Lester E.; Wilcox, Lucas C.; Warburton, Tim

doi:10.1177/1094342017732395

Cited by 27 publications

(28 citation statements)

References 43 publications


“…Over the last two decades or so, the spectral element (SE) method has been considered as a numerical method for the fluid flow solver in global weather/climate models (Baer et al.; Choi & Hong; Fournier et al.; Giraldo et al.; Kelly & Giraldo). The main motivations were the SE methods' near‐perfect scalability (Dennis et al.), GPU (Graphics Processing Unit) acceleration (e.g., Abdi, Giraldo, et al.; Abdi, Wilcox, et al.), high‐order accuracy for smooth problems, and mesh refinement capabilities. For some time the Community Earth System Model (CESM; Hurrell et al.) has supported a SE dynamical core option in the atmosphere component CAM (Community Atmosphere Model; Neale et al.) discretized on a cubed‐sphere grid (Figure a).…”

Section: Introduction (mentioning)

confidence: 99%

NCAR Release of CAM‐SE in CESM2.0: A Reformulation of the Spectral Element Dynamical Core in Dry‐Mass Vertical Coordinates With Comprehensive Treatment of Condensates and Energy

Lauritzen

Nair

Herrington

et al. 2018

J Adv Model Earth Syst

118

152


It is the purpose of this paper to provide a comprehensive documentation of the new NCAR (National Center for Atmospheric Research) version of the spectral element (SE) dynamical core as part of the Community Earth System Model (CESM2.0) release. This version differs from previous releases of the SE dynamical core in several ways. Most notably the hybrid sigma vertical coordinate is based on dry air mass, the condensates are dynamically active in the thermodynamic and momentum equations (also referred to as condensate loading), and the continuous equations of motion conserve a more comprehensive total energy that includes condensates. Not related to the vertical coordinate change, the hyperviscosity operators and the vertical remapping algorithms have been modified. The code base has been significantly reduced, sped up, and cleaned up as part of integrating SE as a dynamical core in the CAM (Community Atmosphere Model) repository rather than importing the SE dynamical core from High‐Order Methods Modeling environment as an external code.


“…The second assessment concerns the tests we conducted with OCCA kernels in the NUMA code on fully 1D and 3D Implicit-Explicit time-integrated dynamics. We describe in detail these results in the RESULTS section and present them in more detail in Abdi, Giraldo, Constantinescu, Carr III, Wilcox, and Warburton [4].…”

Section: Assess Performance (mentioning)

confidence: 99%

“…We have validated our results with standard benchmark problems in numerical weather prediction and evaluated the performance and strong scalability of the IMEX method using up to 4192 GPUs. The results regarding the implementation of the IMEX methods in NUMA on many-core architectures can be found in Abdi, Giraldo, Constantinescu, Carr III, Wilcox, and Warburton [4].…”

Section: Figure 1: Overview Of Occa Api Kernel Languages and Progra… (mentioning)

confidence: 99%

“…The largest time-to-solution obtained using 128 MPI processes is about 7% slower than one using purely OpenMP with 128 threads. More details on running NUMA using implicit time-integration on either Nvidia GPUs or Intel's KNL can be found in Abdi, Giraldo, Constantinescu, Carr III, Wilcox, and Warburton [4].…”

Section: Time To Solution Of a 3d Rising Thermal Bubble Problem Is… (mentioning)

confidence: 99%


NPS-NRL-Rice-UIUC Collaboration on Navy Atmosphere-Ocean Coupled Models on Many-Core Computer Architectures Annual Report

Wilcox

Giraldo

Campbell

et al. 2015


Dynamics

Explicit-in-time CG: a continuous Galerkin discretization of the compressible Euler mini-app with explicit time integration;

Explicit-in-time DG: a discontinuous Galerkin discretization of the compressible Euler mini-app with explicit time integration;

Vertically Semi-Implicit CG: a continuous Galerkin discretization of the compressible Euler mini-app with vertically implicit semi-implicit time integration;

Vertically Semi-Implicit DG: a discontinuous Galerkin discretization of the compressible Euler mini-app with vertically implicit semi-implicit time integration.

Once the performance of a mini-app is accepted, it will be considered for adoption into NUMA. Extending the kernels in NUMA is being led by Giraldo and his postdoctoral researcher Abdi. We will also make these mini-apps available to the community to be imported into other codes if desired. Wilcox is working closely with Warburton and his team to lead the effort to develop the mini-apps, including hand-rolled computational kernels optimized for GPU accelerators. These kernels are "hand-written" in OCCA, a library Warburton's group is developing that allows a single kernel to be compiled using many different threading frameworks, such as CUDA, OpenCL, OpenMP, and Pthreads. We are initially developing hand-written kernels to provide a performance target for the Loo.py generated kernels. Parallel communication between computational nodes will use the MPI standard to enable the mini-apps to run on large-scale clusters. Using these community standards for parallel programming will allow our mini-apps to be portable to many platforms; however, the performance may not be portable across devices. For performance portability, we, led by Klöckner, are using Loo.py to generate OCCA kernels which can be automatically tuned for current many-core devices along with future ones.

The second objective is to expand (as needed by the mini-apps) the OCCA and Loo.py efforts, led by Warburton and Klöckner. These will be extended naturally in tandem with the requirements that emerge during the development of the mini-apps. We will take a pragmatic approach where features will only be added as they are needed by the mini-apps or will aid in the transition of the mini-app technology back into NUMA. For the sharing part of the objective, both OCCA and Loo.py are open source and can be downloaded from https://github.com/libocca/occa and https://github.com/inducer/loopy, respectively. They are operational and are ready to be evaluated for use in other projects. As warranted, we will give presentations and demonstrations of the tools to help increase their adoption.

The third objective, led by Campbell, is to implement NEPTUNE as an ESMF component. This will be done in collaboration with the developers of NEPTUNE at NRL Monterey. Once NEPTUNE has been made a component, we will move to running the coupled air-ocean-wave-ice system involving NEPTUNE (with NUMA as its dynamical core), HYCOM, Wavewatch III, and CICE within the Navy ESPC.

WORK COMPLETED

In the course of this project we p...


“…Promising approaches for satisfying the latter condition are exponential time integrators [36,47]; (b) to overcome the overly restrictive time-step limitations of EBTI schemes combined with highly scalable horizontal discretizations, either through horizontal/vertical splitting (HEVI) [2,8,40] or through combining SISL PBTI methods with discontinuous Galerkin (DG) discretization [99]; and (c) to further the scalability and the adaptation of algorithms to emerging HPC architectures involving SE [32] or fully-implicit time-stepping approaches [113], and further through exploiting additional parallelism with time-parallel algorithms [33].…”

Section: Discussion and Concluding Remarks (mentioning)

confidence: 99%

Current and Emerging Time-Integration Strategies in Global Numerical Weather and Climate Prediction

Mengaldo

Wyszogrodzki

Diamantakis

et al. 2018

Arch Computat Methods Eng


The continuous partial differential equations governing a given physical phenomenon, such as the Navier-Stokes equations describing the fluid motion, must be numerically discretized in space and time in order to obtain a solution otherwise not readily available in closed (i.e., analytic) form. While the overall numerical discretization plays an essential role in the algorithmic efficiency and physically-faithful representation of the solution, the time-integration strategy commonly is one of the main drivers in terms of cost-to-solution (e.g., time- or energy-to-solution), accuracy and numerical stability, thus constituting one of the key building blocks of the computational model. This is especially true in time-critical applications, including numerical weather prediction (NWP), climate simulations and engineering. This review provides a comprehensive overview of the existing and emerging time-integration (also referred to as time-stepping) practices used in the operational global NWP and climate industry, where global refers to weather and climate simulations performed on the entire globe. While there are many flavors of time-integration strategies, in this review we focus on the most widely adopted in NWP and climate centers and we emphasize the reasons why such numerical solutions were adopted. This allows us to make some considerations on future trends in the field such as the need to balance accuracy in time with substantially enhanced time-to-solution and associated implications on energy consumption and running costs. In addition, the potential for the co-design of time-stepping algorithms and underlying high performance computing hardware, a keystone to accelerate the computational performance of future NWP and climate services, is also discussed in the context of the demanding operational requirements of the weather and climate industry.
