2021
DOI: 10.1021/acs.jctc.1c00720

Faster Self-Consistent Field (SCF) Calculations on GPU Clusters

Abstract: A novel implementation of the self-consistent field (SCF) procedure specifically designed for high-performance execution on multiple graphics processing units (GPUs) is presented. The algorithm offloads to GPUs the three major computational stages of the SCF, namely, the calculation of one-electron integrals, the calculation and digestion of electron repulsion integrals, and the diagonalization of the Fock matrix, including SCF acceleration via DIIS. Performance results for a variety of test molecules and basis…
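The abstract names DIIS as the SCF acceleration step. As a rough illustration only, the textbook form of the DIIS extrapolation can be sketched in NumPy as below. This is not the GPU implementation described in the paper: the function name `diis_extrapolate` is hypothetical, the error vectors are assumed to be supplied by the caller (they are commonly taken as the commutator FDS - SDF), and everything runs on the CPU.

```python
import numpy as np

def diis_extrapolate(focks, errors):
    """Return the DIIS-extrapolated Fock matrix and the mixing coefficients.

    focks  : list of Fock matrices from previous SCF iterations
    errors : list of matching error arrays (assumed supplied by the caller)
    """
    n = len(focks)
    # Augmented B matrix: B[i, j] = <e_i, e_j>, bordered by -1 entries
    # that enforce the constraint sum(c) = 1 via a Lagrange multiplier.
    B = np.zeros((n + 1, n + 1))
    for i in range(n):
        for j in range(n):
            B[i, j] = np.vdot(errors[i], errors[j])
    B[n, :n] = -1.0
    B[:n, n] = -1.0

    rhs = np.zeros(n + 1)
    rhs[n] = -1.0

    # Solve for the coefficients; the last entry is the Lagrange multiplier.
    solution = np.linalg.solve(B, rhs)
    coeffs = solution[:n]

    # The extrapolated Fock matrix is the linear combination of the history.
    fock = sum(c * f for c, f in zip(coeffs, focks))
    return fock, coeffs
```

With two stored iterations whose error vectors are equal and opposite, the constraint forces equal weights, so the extrapolated matrix is the average of the two inputs. A production code would also limit the history length and guard against an ill-conditioned B matrix, which this sketch omits.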

Cited by 32 publications (33 citation statements)
References 38 publications
“…Q-Next was developed on top of the EXtreme-scale Electronic Structure System (EXESS) high performance quantum chemistry code-base, which is interfaced with the GAMESS quantum chemistry package. EXESS provides the code for the Fock build on multiple NVIDIA GPUs, which is shared among the Q-Next and traditional DIIS implementations.…”
Section: Methods (mentioning)
confidence: 99%
“…While efficient parallel schemes for Fock matrix construction have been devised, even when using high-performance eigensolvers, the diagonalization of large Fock matrices does not achieve a high parallel efficiency. Therefore, as computer systems dedicated to scientific calculations move toward massively parallel architectures with hundreds to millions of processor cores, the Fock matrix diagonalization becomes increasingly inefficient in taking advantage of the FLOP capabilities of the hardware and a major impediment to achieving larger molecular sizes in SCF calculations.…”
Section: Introduction (mentioning)
confidence: 99%
“…The high performance multi-GPU capabilities in the GAMESS/LibCChem [20] suite of programs include a GPU-accelerated Fock build [21], a full implementation of the Self-Consistent-Field (SCF) method [22] including the one-electron integrals and the Direct-Inversion of the Iterative Subspace (DIIS) algorithm. These routines have been scaled up to the entirety of the Summit supercomputer [23] and have demonstrated excellent parallel efficiency when coupled with fragmentation methods [24] [25].…”
Section: Overview of GAMESS Calculations on GPUs (mentioning)
confidence: 99%
“…The overarching scheme for the SCF program includes a coordinator/worker dynamic work balancing algorithm, the steps of which are as follows: Firstly, the basis set information is accepted, the shells and shell pairs are constructed. Secondly, the shell pairs are sorted and stored in the binned batch container [22],…”
Section: Overview of GAMESS Calculations on GPUs (mentioning)
confidence: 99%
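
The steps quoted above (construct shells and shell pairs, sort the pairs, store them in binned batches, then balance work dynamically between a coordinator and workers) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration and is not taken from the GAMESS or EXESS source: the `Shell` fields, the binning key (angular momentum class), the sort key (product of primitive counts), the batch size, and the cost model used to simulate which worker becomes free first.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations_with_replacement
import heapq

@dataclass(frozen=True)
class Shell:
    index: int   # position of the shell in the basis set
    l: int       # angular momentum (0 = s, 1 = p, ...)
    nprim: int   # number of primitive Gaussians

def build_shell_pairs(shells):
    """All unique pairs (a, b) with a.index <= b.index."""
    return list(combinations_with_replacement(shells, 2))

def bin_shell_pairs(pairs, batch_size):
    """Group pairs by angular momentum class, sort each group by an
    assumed cost estimate, and cut it into fixed-size batches."""
    bins = defaultdict(list)
    for a, b in pairs:
        key = tuple(sorted((a.l, b.l), reverse=True))
        bins[key].append((a, b))

    batches = []
    for key in sorted(bins):
        # Most expensive pairs first (hypothetical cost: nprim product).
        members = sorted(bins[key],
                         key=lambda p: p[0].nprim * p[1].nprim,
                         reverse=True)
        for start in range(0, len(members), batch_size):
            batches.append((key, members[start:start + batch_size]))
    return batches

def dispatch(batches, n_workers):
    """Simulate a coordinator handing the next batch to whichever
    worker becomes free first (dynamic work balancing)."""
    free_at = [(0, w) for w in range(n_workers)]  # (time worker is free, id)
    heapq.heapify(free_at)
    assignment = defaultdict(list)
    for batch_id, (_, members) in enumerate(batches):
        t, w = heapq.heappop(free_at)
        cost = sum(a.nprim * b.nprim for a, b in members)
        assignment[w].append(batch_id)
        heapq.heappush(free_at, (t + cost, w))
    return dict(assignment)
```

In a real multi-GPU code the workers would request batches as they finish rather than being scheduled from a cost estimate; the heap here only mimics that behaviour deterministically so the sketch can be run and checked on a single process.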