Dynamics
Explicit-in-time CG: a continuous Galerkin discretization of the compressible Euler mini-app with explicit time integration;
Explicit-in-time DG: a discontinuous Galerkin discretization of the compressible Euler mini-app with explicit time integration;
Vertically Semi-Implicit CG: a continuous Galerkin discretization of the compressible Euler mini-app with vertically implicit semi-implicit time integration;
Vertically Semi-Implicit DG: a discontinuous Galerkin discretization of the compressible Euler mini-app with vertically implicit semi-implicit time integration.

Once the performance of a mini-app is accepted, it will be considered for adoption into NUMA. The extension of the kernels in NUMA is being led by Giraldo and his postdoctoral researcher Abdi. We will also make these mini-apps available to the community to be imported into other codes if desired. Wilcox is working closely with Warburton and his team to lead the development of the mini-apps, including hand-rolled computational kernels optimized for GPU accelerators. These kernels are hand-written in OCCA, a library under development by Warburton's group that allows a single kernel to be compiled using many different threading frameworks, such as CUDA, OpenCL, OpenMP, and Pthreads. We are initially developing hand-written kernels to provide a performance target for the Loo.py-generated kernels. Parallel communication between computational nodes will use the MPI standard to enable the mini-apps to run on large-scale clusters. Using these community standards for parallel programming will make our mini-apps portable to many platforms; however, the performance may not be portable across devices. For performance portability, we, led by Klöckner, are using Loo.py to generate OCCA kernels that can be automatically tuned for current many-core devices as well as future ones.

The second objective is to expand (as needed by the mini-apps) the OCCA and Loo.py efforts, led by Warburton and Klöckner.
These will be extended naturally in tandem with the requirements that emerge during the development of the mini-apps. We will take a pragmatic approach in which features are added only when they are needed by the mini-apps or will aid in the transition of the mini-app technology back into NUMA. For the sharing part of the objective, both OCCA and Loo.py are open source and can be downloaded from https://github.com/libocca/occa and https://github.com/inducer/loopy, respectively. They are operational and ready to be evaluated for use in other projects. As warranted, we will give presentations and demonstrations of the tools to help increase their adoption.

The third objective, led by Campbell, is to implement NEPTUNE as an ESMF component. This will be done in collaboration with the developers of NEPTUNE at NRL Monterey. Once NEPTUNE has been made a component, we will move to running the coupled air-ocean-wave-ice system involving NEPTUNE (with NUMA as its dynamical core), HYCOM, Wavewatch III, and CICE within the Navy ESPC.
WORK COMPLETED
In the course of this project we p...