The IEEE 754-2008 Standard governs Floating-Point Arithmetic in all types of Computer Systems. The Standard provides for two radices, 2 and 10. It specifies conversion operations between these radices, but does not allow floating-point formats of different radices to be mixed in computational operations. In contrast, the Standard does provide for mixing formats of one radix in one operation. In order to enhance the Standard and make it closed under all basic computational operations, we propose an algorithm for a correctly rounded mixed-radix Fused-Multiply-and-Add (FMA). Our algorithm takes any combination of IEEE 754 binary64 and decimal64 numbers as arguments and provides a result in IEEE 754 binary64 and decimal64, rounded according to any of the five IEEE 754 rounding modes. Our implementation does not require any dynamic memory allocation; its runtime can be bounded statically. We compare our implementation to a basic mixed-radix FMA implementation based on the GMP Multiple Precision library.
When trying to compile this code, e.g. with gcc 6.3.0, the compiler refuses to translate it and emits an error message stating that floating-point types of different radices cannot be mixed. The user, who might have no other choice than to use and mix two scientific libraries, one of which produces binary floating-point results and the other decimal ones, might hence be inclined to force the compiler to translate this piece of code, e.g. by adding explicit casts (conversions). This attempt may also be misleading, as these conversions induce errors, which may be insignificant but may also accumulate, amplify and become important. Extending our example code as follows showcases this dangerous [4] behavior:
    double a = 2000.0, c = -2.0, d;
    _Decimal64 b = 0.001D;
    /* d = a * b + c */
    d = __builtin_fma(a, (double) b, c);
Tools that automatically prove the absence or detect the presence of large floating-point roundoff errors or the special values NaN and Infinity greatly help developers to reason about the unintuitive nature of floating-point arithmetic. We show, however, that state-of-the-art tools support or provide non-trivial results only for relatively short programs. We propose a framework for combining different static and dynamic analyses that extends their reach beyond what they can do individually. Furthermore, we show how adaptations of existing dynamic and static techniques effectively trade some soundness guarantees for increased scalability, providing conditional verification of floating-point kernels in realistic programs.