50th International Conference on Parallel Processing 2021
DOI: 10.1145/3472456.3472493

Accurate Matrix Multiplication on Binary128 Format Accelerated by Ozaki Scheme

Cited by 5 publications (3 citation statements)
References 19 publications
“…It would be necessary to use extended precision in evolution or use a spatial discretization less prone to round-off error to extract the tails for these modes. In this case, methods such as those based on the Ozaki scheme [38] can be used to accelerate DGEMM operations with extended precision on CPU and GPU architectures, but this is beyond the purposes of the present work.) Having validated our code, we investigate the conservation of Noether charges affiliated with this field.…”
Section: Klein-Gordon Field in Schwarzschild Spacetime (mentioning)
confidence: 99%
“…The usefulness of the Ozaki scheme [10] is also becoming clear at multiple precision levels owing to the success of Mukunoki et al in accelerating the float128 precision matrix multiplication [11]. The float128 arithmetic supported by GCC, features triple-double to quadruple-double precision performance for addition and multiplication, which is expected to be sufficient for this precision range.…”
Section: Ozaki Scheme (mentioning)
confidence: 99%
“…In particular, the recently developed Ozaki scheme is an algorithm that pursues both accuracy and speed by dividing the original matrix into matrices of short mantissa parts, thereby performing fast low-precision matrix multiplication without errors. The Ozaki scheme has already been shown to be effective in float128 calculations [11]; however, since its effectiveness depends on the nature of the matrices used, we must verify the effectiveness of the matrices for concrete problems through benchmark tests. In addition, although there is ongoing research on optimization of real matrix multiplication, there are no comparative studies on the effectiveness of these optimization techniques for multiple precision complex matrix multiplication.…”
Section: Introduction (mentioning)
confidence: 99%
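
The statement above describes the Ozaki scheme as dividing a matrix into slices with short mantissas so that ordinary low-precision products incur no rounding error. The following is a minimal sketch of that idea in pure Python, not the cited paper's implementation. It assumes binary64 inputs rather than binary128, and the names `split_rows`, `matmul_float`, and `ozaki_matmul` are hypothetical. `fractions.Fraction` stands in for the high-precision accumulation of the slice products.

```python
import math
from fractions import Fraction


def split_rows(A, k):
    """Split A into slices S_1, S_2, ... whose sum equals A exactly.

    Each slice keeps only a few leading bits per row, relative to the row
    maximum, so that slice products with inner dimension k accumulate
    without rounding error in binary64 (assuming no overflow/underflow).
    """
    # Number of bits to cut off so that a k-term dot product fits in 53 bits.
    beta = math.ceil((53 + math.log2(k)) / 2)
    R = [row[:] for row in A]          # residual, starts as a copy of A
    slices = []
    while any(x != 0.0 for row in R for x in row):
        S = []
        for row in R:
            mu = max(abs(x) for x in row)
            if mu == 0.0:
                S.append([0.0] * len(row))
                continue
            e = math.frexp(mu)[1]                  # guarantees mu < 2**e
            sigma = math.ldexp(1.0, beta + e)      # sigma = 2**(beta + e)
            # Adding and subtracting sigma rounds away the low-order bits.
            S.append([(x + sigma) - sigma for x in row])
        # The residual is computed exactly and carries the dropped bits.
        R = [[x - s for x, s in zip(r, srow)] for r, srow in zip(R, S)]
        slices.append(S)
    return slices


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul_float(A, B):
    """Plain binary64 matrix product (stand-in for a fast DGEMM call)."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def ozaki_matmul(A, B):
    """Multiply A (m x k) by B (k x n) via error-free slice products."""
    k = len(B)
    A_slices = split_rows(A, k)                                   # split by rows
    B_slices = [transpose(S) for S in split_rows(transpose(B), k)]  # by columns
    m, n = len(A), len(B[0])
    C = [[Fraction(0)] * n for _ in range(m)]
    for Ap in A_slices:
        for Bq in B_slices:
            P = matmul_float(Ap, Bq)   # exact by construction of the slices
            for i in range(m):
                for j in range(n):
                    C[i][j] += Fraction(P[i][j])   # exact accumulation
    return C
```

Calling `ozaki_matmul(A, B)` on two nested lists of floats returns a matrix of `Fraction` values. The number of slice products grows with the spread of magnitudes within each row or column, which is one way to read the quoted remark that the scheme's effectiveness depends on the nature of the matrices.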