Takeshi Ogita scite author profile

Abstract. Given a vector of floating-point numbers with exact sum s, we present an algorithm for calculating a faithful rounding of s, i.e. the result is one of the immediate floating-point neighbors of s. If the sum s is a floating-point number, we prove that this is the result of our algorithm. The algorithm adapts to the condition number of the sum, i.e. it is fast for mildly conditioned sums with slowly increasing computing time proportional to the logarithm of the condition number. All statements are also true in the presence of underflow. The algorithm does not depend on the exponent range. Our algorithm is fast in terms of measured computing time because it allows good instruction-level parallelism, it neither requires special operations such as access to mantissa or exponent, it contains no branch in the inner loop, nor does it require some extra precision: The only operations used are standard floating-point addition, subtraction and multiplication in one working precision, for example double precision. Certain constants used in the algorithm are proved to be optimal.

show abstract

Accurate Floating-Point Summation Part II: Sign, K-Fold Faithful and Rounding to Nearest

Rump¹,

Ogita²,

Oishi³

2009

SIAM J. Sci. Comput.

View full text Add to dashboard Cite

Abstract. In this Part II of this paper we first refine the analysis of error-free vector transformations presented in Part I. Based on that we present an algorithm for calculating the rounded-to-nearest result of s := p i for a given vector of floatingpoint numbers p i , as well as algorithms for directed rounding. A special algorithm for computing the sign of s is given, also working for huge dimensions. Assume a floating-point working precision with relative rounding error unit eps. We define and investigate a K-fold faithful rounding of a real number r. Basically the result is stored in a vector Resν of K non-overlapping floating-point numbers such thatRes ν approximates r with relative accuracy eps K , and replacing Res K by its floating-point neighbors inRes ν forms a lower and upper bound for r. For a given vector of floating-point numbers with exact sum s, we present an algorithm for calculating a K-fold faithful rounding of s using solely the working precision. Furthermore, an algorithm for calculating a faithfully rounded result of the sum of a vector of huge dimension is presented. Our algorithms are fast in terms of measured computing time because they allow good instruction-level parallelism, they neither require special operations such as access to mantissa or exponent, they contain no branch in the inner loop, nor do they require some extra precision: The only operations used are standard floating-point addition, subtraction and multiplication in one working precision, for example double precision. Certain constants used in the algorithms are proved to be optimal.

show abstract

Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications

et al. 2011

View full text Add to dashboard Cite

This paper is concerned with accurate matrix multiplication in floating-point arithmetic. Recently, an accurate summation algorithm was developed by Rump et al. (SIAM J Sci Comput 31(1):189-224, 2008).The key technique of their method is a fast error-free splitting of floating-point numbers. Using this technique, we first develop an error-free transformation of a product of two floating-point matrices into a sum of floating-point matrices. Next, we partially apply this error-free transformation and develop an algorithm which aims to output an accurate approximation of the matrix product. In addition, an a priori error estimate is given. It is a characteristic of the proposed method that in terms of computation as well as in terms of memory consumption, the dominant part of our algorithm is constituted by ordinary floating-point matrix multiplications. The routine for matrix multiplication is 96 Numer Algor (2012) 59:95-118 highly optimized using BLAS, so that our algorithms show a good computational performance. Although our algorithms require a significant amount of working memory, they are significantly faster than 'gemmx' in XBLAS when all sizes of matrices are large enough to realize nearly peak performance of 'gemm'. Numerical examples illustrate the efficiency of the proposed method.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Takeshi Ogita

Accurate Sum and Dot Product

Accurate Floating-Point Summation Part I: Faithful Rounding

Accurate Floating-Point Summation Part II: Sign, K-Fold Faithful and Rounding to Nearest

Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications

Contact Info

Product

Resources

About