2020
DOI: 10.1007/978-3-030-43229-4_44

Reproducible BLAS Routines with Tunable Accuracy Using Ozaki Scheme for Many-Core Architectures

Citations: cited by 14 publications (19 citation statements)
References: 8 publications
“…Recently, Mukunoki and Ogita presented their approach to implement reproducible BLAS, called OzBLAS [18], with tunable accuracy. This approach is different from both ReproBLAS and ExBLAS as it does not require to implement every BLAS routine from scratch but relies on high-performance (vendor) implementations.…”
Section: Related Work (mentioning)
confidence: 99%
“…It has led to the inclusion of error-free transformations (EFTs) for addition and multiplication - to return the exact outcome as the result and the error - to assure numerical reproducibility of floating-point operations, into the revised version of the standard. These mechanisms, once implemented in hardware, will simplify our reproducible algorithms - like the ones used in the ExBLAS [6], ReproBLAS [7], OzBLAS [18] libraries - and boost their performance. There are two approaches that enable the addition of floating-point numbers without incurring round-off errors or with reducing their impact.…”
Section: Introduction (mentioning)
confidence: 99%
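
The error-free transformations mentioned in this statement can be illustrated in software. The following is a minimal sketch, assuming IEEE 754 double precision with round-to-nearest and no overflow or underflow; the function names `two_sum`, `split`, and `two_prod` are chosen here for illustration and are not taken from ExBLAS, ReproBLAS, or OzBLAS. It uses the classical Knuth and Dekker/Veltkamp formulations, which return the rounded result together with its rounding error.

```python
from fractions import Fraction


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def split(a: float) -> tuple[float, float]:
    """Veltkamp splitting: a == hi + lo, each half fitting in about 26 bits."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_prod(a: float, b: float) -> tuple[float, float]:
    """Dekker's TwoProd: return (p, e) with p = fl(a * b) and p + e == a * b exactly."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
    return p, e


if __name__ == "__main__":
    # The small addend is lost in the rounded sum but recovered in the error term.
    s, e = two_sum(1.0, 1e-16)
    print(s, e)

    # Exact rational arithmetic confirms that no information was discarded.
    x = 1.0 + 2.0 ** -30
    p, pe = two_prod(x, x)
    print(Fraction(p) + Fraction(pe) == Fraction(x) * Fraction(x))
```

Hardware support for such operations, as the statement anticipates, would replace these multi-operation sequences with single instructions.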
“…Recently, Mukunoki and Ogita presented their approach to implement reproducible BLAS, called OzBLAS Mukunoki et al (2020), with tunable accuracy. This approach is different from both ReproBLAS and ExBLAS as it does not require to implement every BLAS routine from scratch but relies on high-performance (vendor) implementations.…”
Section: Related Work (mentioning)
confidence: 99%
“…Ensuring the bit-wise reproducibility is often a complex and expensive task that imposes modifications to the algorithm and its underlying parts such as the BLAS (Basic Linear Algebra Subprograms) routines Lawson et al (1979); Dongarra et al (1990). These modifications are necessary to preserve every bit of information (both result and error) Collange et al (2015) or, alternatively, to cut off some parts of the data and operate on the remaining most significant parts Mukunoki et al (2020); Demmel and Nguyen (2015). Furthermore, the bit-wise reproducibility can become expensive with the overhead of at least 8% for parallel reduction Collange et al (2015); Demmel and Nguyen (2015), up to 2x–4x for matrix-vector product Iakymchuk et al (2019b), and more than 10x for matrix–matrix multiplication Iakymchuk et al (2016).…”
Section: Introduction (mentioning)
confidence: 99%
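
Why bit-wise reproducibility needs special treatment at all can be seen in a small experiment. The sketch below, assuming IEEE 754 double precision, sums the same four values in two different orders with a plain left-to-right loop and obtains different results, as can happen when a parallel reduction changes its evaluation order. Python's `math.fsum` is used only as a convenient stand-in for an accumulation that preserves every bit; it is not the method of any of the libraries cited above, and the helper name `naive_sum` is hypothetical.

```python
import math


def naive_sum(values: list[float]) -> float:
    """Plain left-to-right floating-point accumulation (order dependent)."""
    total = 0.0
    for v in values:
        total += v
    return total


if __name__ == "__main__":
    order_a = [1e16, 1.0, -1e16, 1.0]
    order_b = [1e16, -1e16, 1.0, 1.0]  # same values, different order

    # The rounded partial sums differ, so the final bits differ.
    print(naive_sum(order_a), naive_sum(order_b))

    # A correctly rounded sum does not depend on the order of the operands.
    print(math.fsum(order_a), math.fsum(order_b))
```

The naive loop loses the first `1.0` in `order_a` because `1e16 + 1.0` rounds back to `1e16`, while the exact accumulation returns the same value for both orders.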