2012
DOI: 10.1016/j.cpc.2011.08.010

Modified block BiCGSTAB for lattice QCD

Abstract: We present results for the application of the block BiCGSTAB algorithm, modified by the QR decomposition and the SAP preconditioner, to the Wilson-Dirac equation with multiple right-hand sides in lattice QCD on a 32³ × 64 lattice at almost physical quark masses. The QR decomposition improves the convergence behavior of the block BiCGSTAB algorithm by suppressing the deviation between the true residual and the recursive one. The SAP preconditioner applied to the domain-decomposed lattice helps us minimize communication overhead. We find …
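As a rough illustration of the QR step mentioned in the abstract, the sketch below orthonormalizes a block of vectors with modified Gram-Schmidt, so that the block is written as an orthonormal factor times a small upper-triangular matrix. The function name `thin_qr` and the column-list layout are assumptions made for this example; the excerpt does not say where the factorization enters the authors' algorithm or how they compute it.

```python
import math


def thin_qr(columns):
    """Thin QR of a block of vectors by modified Gram-Schmidt.

    `columns` is a list of L vectors of equal length (real or complex).
    Returns (q_cols, r) such that
        columns[k] == sum_j q_cols[j] * r[j][k]
    with orthonormal q_cols and upper-triangular r.
    Illustrative only: not taken from the paper.
    """
    n_cols = len(columns)
    q_cols = [list(c) for c in columns]          # working copy
    r = [[0.0] * n_cols for _ in range(n_cols)]
    for j in range(n_cols):
        norm = math.sqrt(sum(abs(v) ** 2 for v in q_cols[j]))
        if norm == 0.0:
            raise ValueError("block is rank-deficient")
        r[j][j] = norm
        q_cols[j] = [v / norm for v in q_cols[j]]
        # Remove the q_j component from every remaining column.
        for k in range(j + 1, n_cols):
            proj = sum(qv.conjugate() * cv
                       for qv, cv in zip(q_cols[j], q_cols[k]))
            r[j][k] = proj
            q_cols[k] = [cv - proj * qv
                         for qv, cv in zip(q_cols[j], q_cols[k])]
    return q_cols, r
```

Working with the orthonormal factor keeps the columns of the block well scaled and linearly independent in floating point, which is one plausible reading of why the abstract credits the QR decomposition with keeping the recursive residual close to the true one.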

Cited by 16 publications (16 citation statements)
References 15 publications
“…Besides that, in this experiment, the speedup factor is usually 30% higher than the MV ratio, which is robust to the optical property. This extra efficiency can be explained by the better memory cache usage of matrix-matrix multiplication in block BiCGStab as compared to multiple matrix-vector multiplications for multiple sources in sequential BiCGStab [44]. We can observe that the CPU time of the block BiCGStab is less dependent on optical property because it treats multiple right hand sides simultaneously and therefore the solution search through extended Krylov subspace rather than the matrix nature by optical properties is a dominant factor affecting the overall CPU time, whereas the sequential solver deals with multiple right-hand sides individually and thus the individual CPU time that highly depends on the optical properties determines the total CPU time.…”
Section: Numerical Results
mentioning
confidence: 99%
“…Block solvers have been shown to provide large speedups in two recent lattice QCD studies of inverting the Dirac operator with multiple right hand side (RHS) vectors [21,25]. There are two sources of this speed-up: one is that as the number of RHS vectors (n_pf in our case) is increased the number of iterations required for the solver to converge decreases, the other is that applying the Dirac operator to a block of vectors is significantly faster, since the cost of loading the gauge links is amortised over the many RHS vectors, and these data are contiguous allowing better use of the CPU cache.…”
Section: B Block Solvers
mentioning
confidence: 99%
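The amortisation argument in the statement above can be sketched in a few lines. The example compares applying a sparse matrix to each right-hand side separately with applying it once to a block whose right-hand-side index runs fastest in memory. The names `apply_sequential`, `apply_block` and `to_block`, and the `(column, value)` row format, are assumptions made for illustration; a real Dirac operator carries gauge links and spin-colour structure that are not modelled here.

```python
def apply_sequential(rows, vectors):
    """Apply the matrix to each right-hand side on its own.

    `rows[i]` is a list of (column, value) pairs for row i.
    Every matrix entry is read once per right-hand side.
    """
    results = []
    for x in vectors:
        y = [0.0] * len(rows)
        for i, row in enumerate(rows):
            for j, a_ij in row:
                y[i] += a_ij * x[j]
        results.append(y)
    return results


def to_block(vectors):
    """Reorder so that block[j][k] is component j of right-hand side k."""
    n = len(vectors[0])
    return [[v[j] for v in vectors] for j in range(n)]


def apply_block(rows, block):
    """Apply the matrix to all right-hand sides at once.

    Each matrix entry is read once and reused for every right-hand
    side, and the innermost loop walks over contiguous data.
    """
    n_rhs = len(block[0])
    out = [[0.0] * n_rhs for _ in rows]
    for i, row in enumerate(rows):
        out_i = out[i]
        for j, a_ij in row:
            x_j = block[j]
            for k in range(n_rhs):
                out_i[k] += a_ij * x_j[k]
    return out
```

Both routines perform the same arithmetic; only the loop order and data layout differ, which is the part of the speed-up the statement attributes to memory traffic rather than to a reduced iteration count.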
“…Recently there has been renewed interest [19][20][21][22][23][24][25] in the use of block Krylov solvers [26], which invert the same matrix on multiple vectors simultaneously, and thanks to the enlarged Krylov basis from which solutions are constructed, can converge with significantly fewer iterations than are required to solve each vector separately.…”
mentioning
confidence: 99%
“…(2.2). The inversion algorithm for Dirac matrices is the block solver of [22] which accelerates by a factor of 3 ∼ 4 compared with non-block solvers.…”
Section: Methods For 1+1+1 Flavor QCD+QED Simulation At the Physical P
mentioning
confidence: 99%