Accurate Matrix Multiplication on Binary128 Format Accelerated by Ozaki Scheme

Mukunoki, Daichi; Ozaki, Katsuhisa; Ogita, Takeshi; Imamura, Toshiyuki

doi:10.1145/3472456.3472493

Cited by 5 publications

(3 citation statements)

References 19 publications


“…It would be necessary to use extended precision in evolution or use a spatial discretization less prone to round-off error to extract the tails for these modes. In this case, methods such as those based on the Ozaki scheme [38] can be used to accelerate DGEMM operations with extended precision on CPU and GPU architectures, but this is beyond the purposes of the present work.) Having validated our code, we investigate the conservation of Noether charges affiliated with this field.…”

Section: Klein-Gordon Field in Schwarzschild Spacetime (mentioning)

confidence: 99%

Conservative Evolution of Black Hole Perturbations with Time-Symmetric Numerical Methods

O’Boyle, Markakis, Silva et al. 2022

Preprint


The scheduled launch of the LISA Mission in the next decade has called attention to the gravitational self-force problem. Despite an extensive body of theoretical work, long-time numerical computations of gravitational waves from extreme-mass-ratio inspirals remain challenging. This work proposes a class of numerical evolution schemes suitable to this problem based on Hermite integration. Their most important features are time-reversal symmetry and unconditional stability, which enable these methods to preserve symplectic structure, energy, momentum and other Noether charges over long time periods. We apply Noether's theorem to the master fields of black hole perturbation theory on a hyperboloidal slice of Schwarzschild spacetime to show that there exist constants of evolution that numerical simulations must preserve. We demonstrate that time-symmetric integration schemes based on a 2-point Taylor expansion (such as Hermite integration) numerically conserve these quantities, unlike schemes based on a 1-point Taylor expansion (such as Runge-Kutta). This makes time-symmetric schemes ideal for long-time EMRI simulations.


“…The usefulness of the Ozaki scheme [10] is also becoming clear at multiple precision levels owing to the success of Mukunoki et al in accelerating the float128 precision matrix multiplication [11]. The float128 arithmetic supported by GCC, features triple-double to quadruple-double precision performance for addition and multiplication, which is expected to be sufficient for this precision range.…”

Section: Ozaki Scheme (mentioning)

confidence: 99%

“…In particular, the recently developed Ozaki scheme is an algorithm that pursues both accuracy and speed by dividing the original matrix into matrices of short mantissa parts, thereby performing fast low-precision matrix multiplication without errors. The Ozaki scheme has already been shown to be effective in float128 calculations [11]; however, since its effectiveness depends on the nature of the matrices used, we must verify the effectiveness of the matrices for concrete problems through benchmark tests. In addition, although there is ongoing research on optimization of real matrix multiplication, there are no comparative studies on the effectiveness of these optimization techniques for multiple precision complex matrix multiplication.…”

Section: Introduction (mentioning)

confidence: 99%
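
The splitting idea described in the statement above can be illustrated on a single dot product. The following is a minimal sketch, not the implementation from the cited paper: the names `split` and `dot_ozaki` are hypothetical, the slice width `beta` is chosen by a simple bound assumed here, and Python's `Fraction` stands in for the high-precision (for example binary128) accumulation. It also assumes the inputs are finite and far from the underflow range.

```python
import math
from fractions import Fraction


def split(x, beta):
    """Split a float64 vector into slices that sum exactly to x.

    Every entry of a slice is an integer of magnitude < 2**beta times a
    power of two shared by the whole slice (hypothetical helper).
    """
    mu = max(abs(v) for v in x)
    if mu == 0.0:
        return []
    e = math.frexp(mu)[1]  # every |x_i| < 2**e
    residual = list(x)
    slices = []
    k = 1
    while any(r != 0.0 for r in residual):
        q = math.ldexp(1.0, e - beta * k)  # grid spacing of slice k
        # Keep the leading beta bits above the grid; all steps are exact.
        s = [math.trunc(r / q) * q for r in residual]
        residual = [r - t for r, t in zip(residual, s)]
        slices.append(s)
        k += 1
    return slices


def dot_ozaki(a, b):
    """Dot product from error-free float64 slice products."""
    n = len(a)
    # Assumed bound: 2*beta + ceil(log2(n)) <= 53 keeps each slice
    # dot product exactly representable in float64.
    beta = (53 - (n - 1).bit_length()) // 2
    total = Fraction(0)  # stand-in for high-precision accumulation
    for sa in split(a, beta):
        for sb in split(b, beta):
            partial = 0.0
            for u, v in zip(sa, sb):
                partial += u * v  # no rounding error by construction
            total += Fraction(partial)
    return total


a = [1.0 / 3.0, 1e10, -2.5e-7, 3.14159]
b = [7.0 / 9.0, 1e-10, 4e8, -2.71828]
exact = sum(Fraction(u) * Fraction(v) for u, v in zip(a, b))
print(dot_ozaki(a, b) == exact)
```

In a matrix product the same splitting would be applied per row of the left matrix and per column of the right one, so that each pair of slices becomes one ordinary low-precision matrix multiplication. The number of slices grows with the exponent spread of the entries, which is one way to see why the statement says the effectiveness depends on the nature of the matrices.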

Acceleration of Multiple Precision Matrix Multiplication Based on Multi-component Floating-Point Arithmetic Using AVX2

Kouya 2021

Lecture Notes in Computer Science


Efficient multiple precision linear numerical computation libraries such as MPLAPACK are critical in dealing with ill-conditioned problems. Specifically, there are optimization methods for matrix multiplication, such as the Strassen algorithm and the Ozaki scheme, which can be used to speed up computation. For complex matrix multiplication, the 3M method can also be used, which requires only three multiplications of real matrices, instead of the 4M method, which requires four multiplications of real matrices. In this study, we extend these optimization methods to arbitrary precision complex matrix multiplication and verify the possible increase in computation speed through benchmark tests. The optimization methods are also applied to complex LU decomposition using matrix multiplication to demonstrate that the Ozaki scheme can be used to achieve higher computation speeds.


Acceleration of Matrix Multiplication Based on Triple-Double (TD), and Triple-Single (TS) Precision Arithmetic

Utsugiri, Kouya 2022

Computational Science and Its Applications – ICCSA 2022 Workshops
