2011 Symposium on Application Accelerators in High-Performance Computing
DOI: 10.1109/saahpc.2011.13
Efficient Implementation of the Overlap Operator on Multi-GPUs

Abstract: Lattice QCD calculations were one of the first applications to show the potential of GPUs in the area of high performance computing. Our interest is to find ways to effectively use GPUs for lattice calculations using the overlap operator. The large memory footprint of these codes requires the use of multiple GPUs in parallel. In this paper we show the methods we used to implement this operator efficiently. We run our codes both on a GPU cluster and a CPU cluster with similar interconnects. We find tha…

Cited by 31 publications (26 citation statements)
References 17 publications
“…Implicitly restarted Arnoldi method [32,33] was used to compute the eigenvalues and eigenvectors of the overlap operator. For all but one ensemble used in this study, it is efficient to first compute the eigenvalues of D†D in a chiral sector, and then reconstruct the eigenvalues of D using standard techniques [34]. For the N = 64 pgQCD lattice at T = 1.12 T_c, it becomes problematic to distinguish the eigenvalues of near-zero eigenmodes from those of exact zero modes.…”
Section: Appendix B: Scale Invariance and Dirac Spectral Density
Citation type: mentioning
confidence: 99%
“…Compared to the earlier implementation of the overlap operator [20], the current implementation further improves the performance of data exchange on different nodes of the cluster and uses the polynomial approximation for the overlap operator instead of the rational approximation, and has achieved better scaling and further speed up of the calculation by a factor of two on average [21].…”
Section: Numerical Details
Citation type: mentioning
confidence: 99%
“…For neutral particles in a constant electric field the correlation functions still retain their single exponential decay in the limit t → ∞. In particular we have (5) where E(E) has the perturbative expansion in the electric field given by Table 1). The right panel is the same plot but with the values of d γ_5 d scaled by a factor of 4.…”
Section: Introduction
Citation type: mentioning
confidence: 92%
“…For the E 220 ensemble we used similar values: m_π = 270, 300, 340 and 370 MeV. To increase our efficiency of quark propagator calculations we used an optimally implemented multi-GPU Dslash operator [5], along with an efficient BiCGstab multi-mass inverter [6].…”
Section: PoS(LATTICE 2013)287
Citation type: mentioning
confidence: 99%