2019
DOI: 10.1177/1094342019849618

Optimizing the HOMME dynamical core for multicore platforms

Abstract: The approach of the next-generation computing platforms offers a tremendous opportunity to advance the state of the art in global atmospheric dynamical models. We detail our incremental approach to utilize this emerging technology by enhancing concurrency within the High-Order Method Modeling Environment (HOMME) atmospheric dynamical model developed at the National Center for Atmospheric Research (NCAR). The study focused on improvements to the performance of HOMME, which is a Fortran 90 code with a hybrid (MPI…

Cited by 3 publications (3 citation statements)
References 30 publications

“…Lastly, we get consistently lower throughput and higher costs on Cori-KNL than on the other two systems. Our experiment and previous studies (Barnes et al, 2017; Dennis et al, 2019) suggest a few compounding reasons (Appendix D): inefficient memory management for some global arrays, poor vectorization, and less focus on shared-memory parallelism of the CAM5/MPASv4 source code, which are not aligned well with the wider-vector and many-core architecture of KNL. However, the shorter expected queue time on KNL than HW (Figure D1) makes KNL our main system for production.…”
Section: Computational Aspects; citation type: mentioning
confidence: 60%
“…Participation of VR models allows direct and more comprehensive intercomparison of limited-area and global VR models, but requires appropriate adaptations of the experimental protocol and analysis scope to address differences between the two modeling framework, such as the evaluation of soil state and large-scale circulations outside the refinement domain. Having both limited-area and VR models in a coordinated project … The overall performance of climate model code is typically limited by memory latency and bandwidth rather than arithmetic speed (e.g., Fuhrer et al, 2018; Dennis et al, 2019), except for some components such as the MG2 microphysics (Barnes et al, 2017). A naive use of all the 68 cores on KNL nodes as MPI ranks lead to 0.5 GB memory per rank (using KNL's two different memory units as a single entity), about one-fifth of 2.7 GB per rank when using 24 MPI ranks per node on Edison.…”
Section: Discussion; citation type: mentioning
confidence: 99%