Proceedings of the 2018 International Conference on Supercomputing
DOI: 10.1145/3205289.3205296
Optimizing Tensor Contractions in CCSD(T) for Efficient Execution on GPUs

Abstract: Tensor contractions are higher-dimensional analogs of matrix multiplications, used in many computational contexts such as high-order models in quantum chemistry, deep learning, and finite element methods. In contrast to the wide availability of high-performance libraries for matrix multiplication on GPUs, the same is not true for tensor contractions. In this paper, we address the optimization of a set of symmetrized tensor contractions that form the computational bottleneck in the CCSD(T) coupled-cluster metho…
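A minimal sketch (not the paper's code) of what the abstract describes: a 4-index tensor contraction expressed with `numpy.einsum`, and the same contraction rewritten as a matrix multiplication after index permutation, which is how high-performance implementations commonly map contractions onto GEMM. Tensor names, index labels, and dimensions are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
O, V = 4, 6  # occupied / virtual dimensions (hypothetical sizes)

# t2: amplitude-like tensor t[a,b,i,j]; v: integral-like tensor v[c,k,b,j]
t2 = rng.standard_normal((V, V, O, O))
v = rng.standard_normal((V, O, V, O))

# Contract over the shared indices b and j:
# s[a,c,i,k] = sum_{b,j} t2[a,b,i,j] * v[c,k,b,j]
s = np.einsum("abij,ckbj->acik", t2, v)

# Equivalent formulation as a single matrix multiplication:
# group (a,i) as rows, (b,j) as the contracted dimension, (c,k) as columns.
lhs = t2.transpose(0, 2, 1, 3).reshape(V * O, V * O)        # rows (a,i), cols (b,j)
rhs = v.transpose(2, 3, 0, 1).reshape(V * O, V * O)         # rows (b,j), cols (c,k)
s2 = (lhs @ rhs).reshape(V, O, V, O).transpose(0, 2, 1, 3)  # back to (a,c,i,k)

assert np.allclose(s, s2)
```

The transpose-and-reshape step is exactly the data-layout cost that GPU-oriented contraction kernels try to hide or eliminate.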

Cited by 12 publications (6 citation statements)
References 33 publications (38 reference statements)
“…al. [24] improves tensor contractions for coupled-cluster methods in quantum chemistry by fusing multiple contractions. However, their approach performs transposes in shared memory, and these tensor contractions are different from the contractions in Kron-Matmul.…”
Section: Related Work
confidence: 99%
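The excerpt above mentions fusing multiple contractions. A hedged sketch of the idea (shapes and names are illustrative, not taken from the cited work): a chain of two contractions can be evaluated in one fused call, avoiding materialization of the intermediate tensor in memory.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8, 8))
B = rng.standard_normal((8, 8))
C = rng.standard_normal((8, 8))

# Unfused: the intermediate T is written out between the two contractions.
T = np.einsum("abc,cd->abd", A, B)          # contract index c
out_unfused = np.einsum("abd,de->abe", T, C)  # contract index d

# Fused: one call, one optimized contraction path, no explicit intermediate.
out_fused = np.einsum("abc,cd,de->abe", A, B, C, optimize=True)

assert np.allclose(out_unfused, out_fused)
```

On GPUs the payoff is larger than this sketch suggests, since the intermediate would otherwise round-trip through global memory.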
“…High throughput is achieved by operating many warps simultaneously. For example, the current V100 GPUs group 32 cores into a warp, and there are 160 warps to a GPU [54,55]. Various existing computational chemistry codes have been adapted [53,56-67] or designed from the outset (e.g., TeraChem) [67,68] to use GPUs.…”
Section: Hardware and Software Evolution Challenges
confidence: 99%
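A quick sanity check of the figures quoted above: 32 cores per warp-sized group times 160 such groups matches the V100's 5120 FP32 CUDA cores (80 SMs with 64 FP32 cores each).

```python
cores_per_warp = 32   # cores grouped per warp, per the quoted excerpt
warps_per_gpu = 160   # warp-sized groups per V100, per the quoted excerpt
total_cores = cores_per_warp * warps_per_gpu  # 5120

# Cross-check against the V100's published SM layout.
sms, fp32_cores_per_sm = 80, 64
assert total_cores == sms * fp32_cores_per_sm == 5120
```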
“…Sparse tensor contraction. Dense tensor contraction has been studied for decades on diverse hardware platforms [5,19,21,27,28,32,34,42,50,65,72,73] in scientific computing, including chemistry, physics, and mechanics. The state-of-the-art studies focus on block-sparse tensor contractions with dense blocks in tensors.…”
Section: Related Work
confidence: 99%