Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2017
DOI: 10.1145/3126908.3126956
An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel® Xeon Phi processor

Abstract: Modern OpenMP threading techniques are used to convert the MPI-only Hartree-Fock code in the GAMESS program to a hybrid MPI/OpenMP algorithm. Two separate implementations are considered, which differ in whether key data structures, the density and Fock matrices, are shared or replicated among threads. All implementations are benchmarked on a supercomputer of 3,000 Intel® Xeon Phi™ processors. With 64 cores per processor, scaling numbers are reported on up to 192,000 cores. The hybrid MPI/OpenMP implementation reduces th…
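The trade-off the abstract describes, one Fock matrix shared by all OpenMP threads versus a private replica per thread, can be illustrated with a short OpenMP sketch. This is a toy illustration only, not the GAMESS code: the names fock_shared, fock_replicated, toy_integral, the basis size N, and the uniform density are invented for the example, and the integral routine is a placeholder.

/* Minimal, hypothetical sketch (not the GAMESS source) contrasting the two
 * strategies named in the abstract: a shared Fock matrix updated with atomics
 * versus per-thread replicas reduced at the end.
 * Build: gcc -fopenmp fock_sketch.c -o fock_sketch */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 64  /* toy basis-set dimension, illustrative only */

/* Placeholder for a two-electron integral; real code calls an integral package. */
static double toy_integral(int i, int j) { return 1.0 / (1.0 + i + j); }

/* Strategy 1: shared Fock matrix. One integral touches more than one Fock
 * element, so threads can race on the same entry; omp atomic serializes them. */
static void fock_shared(const double *density, double *fock) {
    #pragma omp parallel for collapse(2) schedule(dynamic)
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            double g = toy_integral(i, j);
            #pragma omp atomic
            fock[i * N + j] += g * density[j * N + i];
            #pragma omp atomic
            fock[j * N + i] += g * density[i * N + j];
        }
}

/* Strategy 2: replicated Fock matrix. No atomics in the hot loop, but every
 * thread holds an N*N copy, and the copies are summed afterwards. */
static void fock_replicated(const double *density, double *fock) {
    #pragma omp parallel
    {
        double *local = calloc((size_t)N * N, sizeof(double));
        #pragma omp for collapse(2) schedule(dynamic)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j) {
                double g = toy_integral(i, j);
                local[i * N + j] += g * density[j * N + i];
                local[j * N + i] += g * density[i * N + j];
            }
        #pragma omp critical            /* reduce replicas one thread at a time */
        for (int k = 0; k < N * N; ++k)
            fock[k] += local[k];
        free(local);
    }
}

int main(void) {
    double *density = malloc((size_t)N * N * sizeof(double));
    double *f1 = calloc((size_t)N * N, sizeof(double));
    double *f2 = calloc((size_t)N * N, sizeof(double));
    for (int k = 0; k < N * N; ++k) density[k] = 1.0;
    fock_shared(density, f1);
    fock_replicated(density, f2);
    printf("F[0][0]: shared = %f, replicated = %f\n", f1[0], f2[0]);
    free(density); free(f1); free(f2);
    return 0;
}

In the replicated variant the memory footprint grows with the thread count, which is exactly the pressure point on a 64-core Xeon Phi node with limited memory per core; the shared variant trades that memory for synchronization overhead in the innermost loop.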

Cited by 15 publications (21 citation statements). References 26 publications.
“…In addition, KNL-specific optimizations and tuning are proposed in [63] for seismic computations, with detailed comparisons between KNC and Haswell. In [78], the performance of the hybrid MPI+OpenMP programming paradigm is provided on the Theta supercomputer based on KNL compute nodes. The many-core scalability of the edge-based graph coloring algorithm performance is provided in [79], which targets both GPU as well as KNC.…”
Section: State-of-the-art Shared-memory Optimizations
Citation type: mentioning
Confidence: 99%
“…It also eliminates intranode MPI communications. This hybrid MPI+OpenMP parallelization has been recently implemented in GAMESS …”
Section: Methods
Citation type: mentioning
Confidence: 99%
(A minimal sketch of this hybrid launch pattern appears after the citation list.)
“…This matrix (Equation ) is composed of the internal contribution and ESP. The former is the usual Fock matrix as in non-FMO calculations, and its parallelization is described elsewhere. Therefore, the description below (Sections 2.2.1 and 2.2.2) focuses on ESP, which is FMO specific, and according to Equation  consists of one-electron and two-electron contributions.…”
Section: Methods
Citation type: mentioning
Confidence: 99%
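The Methods citation above notes that the hybrid MPI+OpenMP parallelization eliminates intranode MPI communication. As a hedged illustration of that launch pattern, again not the GAMESS code, the following sketch runs one MPI rank per node with OpenMP threads covering the cores; the per-thread work and the final reduction are stand-ins.

/* Minimal illustrative sketch of the hybrid pattern cited above: one MPI rank
 * per node, OpenMP threads inside the node, so no MPI messages cross cores of
 * the same node. Not GAMESS code; the per-thread work is a stand-in.
 * Build: mpicc -fopenmp hybrid_sketch.c -o hybrid_sketch */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    /* FUNNELED: only the master thread calls MPI, the usual choice when
     * OpenMP covers all intranode parallelism. */
    int provided = 0;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank = 0, nranks = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Intranode parallelism: plain OpenMP threads, no MPI traffic. */
    double local_sum = 0.0;
    #pragma omp parallel reduction(+:local_sum)
    local_sum += 1.0;               /* stand-in for per-thread Fock-build work */

    /* Internode parallelism: a single collective per node-level rank. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d ranks x %d threads -> %.0f work units\n",
               nranks, omp_get_max_threads(), global_sum);

    MPI_Finalize();
    return 0;
}

Launched with one rank per node and OMP_NUM_THREADS set to the core count, this pattern replaces what would otherwise be 64 MPI ranks per Xeon Phi node with 64 threads sharing one address space, which is the source of the intranode-communication savings the citation describes.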