2023
DOI: 10.1063/5.0151070

Distributed memory, GPU accelerated Fock construction for hybrid, Gaussian basis density functional theory

Abstract: With the growing reliance of modern supercomputers on accelerator-based architectures such as graphics processing units (GPUs), the development and optimization of electronic structure methods to exploit these massively parallel resources has become a recent priority. While significant strides have been made in the development of GPU-accelerated, distributed-memory algorithms for many modern electronic structure methods, the primary focus of GPU development for Gaussian basis atomic orbital methods has been for sha…

Cited by 6 publications
(2 citation statements)
References 117 publications
“…Thus, the question remains: what is a better way to evaluate 2-electron integrals over high-l Gaussian AOs on GPUs? Our recent work on the density-fitting-accelerated J-matrix engine implementation for GPUs based on McMurchie–Davidson (MD) recurrences hinted that the MD recurrences recast in matrix form might be the way to go for some classes of integrals; that development also provided many fundamental elements that we reused in this work. We were not alone in thinking that the SHARK integral engine developed by Frank Neese and recently incorporated into the public release of ORCA program illustrated how efficient the MD scheme can be when expressed as a matrix multiplication (matmul) on conventional CPUs, at least when used for integrals over high angular momenta (SHARK supplements the MD approach with traditional Obara-Saika-based kernels implemented in the Libint library).…”
Section: Introduction
confidence: 99%
“…A possible workaround for the challenge of high-l integrals is the use of real-space factorization of 2-electron integrals, as illustrated recently by us and collaborators [24] via the use of real-space quadrature ("pseudospectral" [25], also known as chain-of-spheres [26] or seminumerical [27]) approximation to the exact exchange, which trades the problem of computing 4-center 2-electron integrals for evaluation of cheaper but much more numerous 2-center 1-electron Gaussian AO integrals. Nevertheless, these and other numerical approximations that avoid the 4-center integrals cannot entirely eliminate the need to evaluate 4-center 2-electron integrals; thus, their efficient evaluation, especially for high angular momenta, remains a critical challenge on modern HPC platforms.…”
Section: Introduction
confidence: 99%
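
The real-space factorization described in the statement above can be illustrated with a toy sketch. It is not taken from the cited papers: the array names (`X`, `w`, `A`, `D`), the sizes, and the random stand-in data are all assumptions made for illustration. It assumes the usual seminumerical form, in which a 4-center integral (μλ|νσ) is approximated by a sum over grid points g of w_g χ_μ(r_g) χ_λ(r_g) A_νσ(r_g), where A_νσ(r_g) stands for a 2-center 1-electron Coulomb-type integral. The staged contraction never forms the 4-index tensor, and the check against an explicit 4-index build only confirms that the two contraction orders agree.

```python
import numpy as np

# Toy sizes and random data: illustrative assumptions, not real integrals.
rng = np.random.default_rng(0)
n_ao, n_grid = 4, 50

X = rng.standard_normal((n_grid, n_ao))        # X[g, mu]: AO mu evaluated at grid point g
w = rng.random(n_grid)                         # quadrature weights
A = rng.standard_normal((n_grid, n_ao, n_ao))  # A[g, nu, sigma]: stand-in for 1-electron integrals
A = 0.5 * (A + A.transpose(0, 2, 1))           # symmetric in (nu, sigma)
D = rng.standard_normal((n_ao, n_ao))
D = 0.5 * (D + D.T)                            # symmetric stand-in for the density matrix


def exchange_seminumerical(X, w, A, D):
    """Staged contraction that never builds a 4-index tensor."""
    F = X @ D                                  # F[g, sigma] = sum_lambda X[g, lambda] D[lambda, sigma]
    G = np.einsum("gns,gs->gn", A, F)          # G[g, nu] = sum_sigma A[g, nu, sigma] F[g, sigma]
    return np.einsum("g,gm,gn->mn", w, X, G)   # K[mu, nu] = sum_g w[g] X[g, mu] G[g, nu]


def exchange_reference(X, w, A, D):
    """Same quantity via an explicit 4-index tensor (only feasible for toy sizes)."""
    eri = np.einsum("g,gm,gl,gns->mlns", w, X, X, A)  # approximated (mu lambda|nu sigma)
    return np.einsum("ls,mlns->mn", D, eri)           # K[mu, nu] = sum D[lambda, sigma] (mu lambda|nu sigma)


K_fast = exchange_seminumerical(X, w, A, D)
K_ref = exchange_reference(X, w, A, D)
print(np.allclose(K_fast, K_ref))
```

In this sketch the staged version stores at most a 3-index array, while the reference version stores n_ao^4 elements. That is the trade the statement describes: many cheap grid-point integrals replace fewer expensive 4-center ones.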