2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum
DOI: 10.1109/ipdpsw.2012.175
Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs

Abstract: We implemented and evaluated the triple precision Basic Linear Algebra Subprograms (BLAS) subroutines AXPY, GEMV, and GEMM on a Tesla C2050. In this paper, we present a Double+Single (D+S) type triple precision floating-point value format and its operations. They are based on techniques similar to Double-Double (DD) type quadruple precision operations. On the GPU, the D+S-type operations are more costly than the DD-type operations in theory and in practice. Therefore, the triple precision GEMM, which is a compute-…
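
The abstract only outlines the D+S format, but its description (an FP64 high word plus an FP32 low word, with operations built like Double-Double arithmetic) is enough for a minimal host-side sketch. The names ds_t, two_sum, and ds_add below are illustrative, not the paper's API, and this is a plain-C approximation under those assumptions rather than the authors' GPU code:

```c
#include <stdio.h>

/* Assumed D+S ("double + single") value: the represented number is the
 * unevaluated sum hi + lo, an FP64 high word plus an FP32 low word,
 * giving roughly 53 + 24 significand bits (about triple single precision). */
typedef struct { double hi; float lo; } ds_t;

/* Knuth's two-sum: an error-free transformation with s + e == a + b exactly.
 * Compile without value-unsafe optimizations (no -ffast-math). */
static void two_sum(double a, double b, double *s, double *e) {
    *s = a + b;
    double v = *s - a;
    *e = (a - (*s - v)) + (b - v);
}

/* Hypothetical D+S addition in the style of double-double addition,
 * with the low word rounded back to FP32 at the end. */
static ds_t ds_add(ds_t x, ds_t y) {
    double s, e;
    two_sum(x.hi, y.hi, &s, &e);
    e += (double)x.lo + (double)y.lo;  /* fold the FP32 low words into the error */
    two_sum(s, e, &s, &e);             /* renormalize */
    return (ds_t){ s, (float)e };
}

int main(void) {
    ds_t a = { 1.0, 0.0f };
    ds_t b = { 1e-20, 0.0f };
    ds_t c = ds_add(a, b);  /* the 1e-20 survives in the FP32 low word */
    printf("hi = %.17g  lo = %.9g\n", c.hi, (double)c.lo);
    return 0;
}
```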

Cited by 12 publications (5 citation statements). References 8 publications.
“…Triple precision (double + single float) implementations of BLAS routines on GPUs were presented in [16]. Related to polynomial system solving on a GPU, we mention two recent works.…”
Section: Related Work
confidence: 99%
“…Triple precision (double + single float) implementations of BLAS routines on GPUs were presented in [16].…”
Section: Related Work
confidence: 99%
“…One such approach is the double-double and quad-double precision, where a single value is represented as the sum of two and four FP64 values, respectively, and arithmetic operations are performed using a sequence of FP64 operations (Hida et al., 2001). GEMM and other BLAS functions for double-double precision have been evaluated on NVIDIA GPUs and AMD Cypress GPUs (Mukunoki and Takahashi, 2012; Nakasato, 2011). Another approach is the Ozaki scheme, which also splits a value into multiple lower-precision values (Ozaki et al., 2012, 2013).…”
Section: Introduction
confidence: 99%
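
The statement above describes double-double arithmetic only at a high level. As a concrete illustration, here is a minimal double-double addition built from the two-sum error-free transformation, following the general pattern described by Hida et al. (2001) rather than any specific library's API; dd_t and dd_add are illustrative names:

```c
#include <stdio.h>

/* Double-double value: an unevaluated sum hi + lo of two FP64 words,
 * carrying roughly 106 significand bits. */
typedef struct { double hi; double lo; } dd_t;

/* Knuth's two-sum: s + e == a + b exactly (requires strict FP semantics). */
static void two_sum(double a, double b, double *s, double *e) {
    *s = a + b;
    double v = *s - a;
    *e = (a - (*s - v)) + (b - v);
}

/* Double-double addition as a short sequence of FP64 operations:
 * add the high and low words with exact error terms, then renormalize. */
static dd_t dd_add(dd_t x, dd_t y) {
    double s, e, t, f;
    two_sum(x.hi, y.hi, &s, &e);
    two_sum(x.lo, y.lo, &t, &f);
    e += t;
    two_sum(s, e, &s, &e);
    e += f;
    two_sum(s, e, &s, &e);
    return (dd_t){ s, e };
}

int main(void) {
    dd_t a = { 1.0, 1e-30 };
    dd_t b = { -1.0, 0.0 };
    dd_t c = dd_add(a, b);  /* recovers 1e-30, which plain FP64 addition loses */
    printf("%.3g\n", c.hi);
    return 0;
}
```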
“…The suitability of double-double and triple precision Basic Linear Algebra Subroutines (BLAS) was shown in [15,16].…”
Section: Introduction
confidence: 99%