2022 IEEE 29th Symposium on Computer Arithmetic (ARITH)
DOI: 10.1109/arith54963.2022.00010

MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores

Cited by 7 publications (12 citation statements). References 18 publications.
“…These libraries provide various trade-offs. The reader interested in mathematical function algorithms can consult the books by Beebe (2017) and Muller (2016). Tests of recent mathematical libraries can be found in Innocente and Zimmermann (2022).…”
Section: Mathematical Libraries
confidence: 99%
“…Since division and square root are significantly slower than the other operations (see Figure 1.1), if we avoid using them, the only functions of one variable we can compute are piecewise polynomials. Hence the mathematical functions are very often approximated by polynomials (Cody and Waite 1980, Muller 2016, Beebe 2017). It is therefore important to be able to tightly control the accuracy of polynomial approximations, as well as the accuracy with which we evaluate polynomials in floating-point arithmetic.…”
Section: Tools For Polynomial Approximation Of Functions
confidence: 99%
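The point about computing functions with only additions and multiplications can be made concrete with Horner's scheme. The sketch below is illustrative only and is not taken from the cited works: it evaluates a degree-4 truncated-Taylor polynomial for exp near 0 (a real libm kernel would instead use a minimax polynomial on a reduced argument), and the function name `horner` and the coefficient choice are assumptions for this example.

```python
import numpy as np

def horner(coeffs, x):
    """Evaluate a polynomial with Horner's scheme.

    coeffs are ordered from the highest-degree term down to the constant
    term, so the evaluation needs only multiply-add steps: no division
    or square root, matching the "piecewise polynomial" argument above.
    """
    acc = np.zeros_like(x) + coeffs[0]
    for c in coeffs[1:]:
        acc = acc * x + c
    return acc

# Illustrative degree-4 truncated-Taylor coefficients for exp(x) around 0
# (highest degree first); a production kernel would use a minimax
# approximation on a reduced interval instead.
coeffs = [1.0 / 24, 1.0 / 6, 0.5, 1.0, 1.0]

x = np.linspace(-0.35, 0.35, 8)
print(np.max(np.abs(horner(coeffs, x) - np.exp(x))))  # worst-case error on this grid
```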
“…Our FPUs include the ExSdotp extension of [34]. This FPU computational unit implements the widening sum-of-dot-products operation.…”
Section: Functional Units
confidence: 99%
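The operation the excerpt refers to can be paraphrased numerically: an expanding (widening) sum of two products, e.g. two fp16 multiplications accumulated into an fp32 result. The snippet below is a minimal numerical sketch under that reading, not a model of the paper's datapath; the helper name `exsdotp_ref` is made up for illustration, and the sequential fp32 evaluation may round differently from the fused hardware unit.

```python
import numpy as np

def exsdotp_ref(acc_f32, a, b):
    """Reference for one widening sum-of-dot-products step (sketch).

    a and b each hold two fp16 operands; the products and the final
    accumulation are carried out in fp32, mirroring the "widening"
    behaviour described for ExSdotp (narrow inputs, wide result).
    Note: this sequential evaluation may not match the rounding of
    the actual fused datapath.
    """
    a = np.asarray(a, dtype=np.float16).astype(np.float32)
    b = np.asarray(b, dtype=np.float16).astype(np.float32)
    return np.float32(acc_f32) + a[0] * b[0] + a[1] * b[1]

acc = np.float32(0.0)
acc = exsdotp_ref(acc, [1.5, 2.0], [0.25, -1.0])  # 0.375 - 2.0 = -1.625
print(acc)
```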
“…We also achieve high performance with the wid-matmul16, a widening matrix multiplication that operates on 16-bit floating point operands and accumulates in a 32-bit register. In this scenario, the 64-bit ExSdotp [34] datapath of each FPU can execute four half-precision FMAs per cycle. For a large 128 × 128 matrix multiplication, Spatz reaches 61.5 FLOPS_HP/cycle, an FMA utilization of 96.1%.…”
Section: L1 Instruction
confidence: 99%
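The quoted utilization figure can be sanity-checked with simple arithmetic. The sketch below assumes a configuration with 8 FPUs, which is not stated in the excerpt; under that assumption, 4 half-precision FMAs per FPU per cycle give a peak of 64 HP FLOPs/cycle, and 61.5/64 ≈ 96.1% matches the quoted utilization.

```python
# Back-of-the-envelope check of the reported Spatz numbers.
# Assumption (not stated in the excerpt): the configuration has 8 FPUs.
fmas_per_fpu_per_cycle = 4   # half-precision FMAs through the 64-bit ExSdotp datapath
flops_per_fma = 2            # one multiply + one add
n_fpus = 8                   # assumed cluster configuration

peak_hp_flops_per_cycle = fmas_per_fpu_per_cycle * flops_per_fma * n_fpus  # 64
achieved = 61.5
print(achieved / peak_hp_flops_per_cycle)  # ~0.961, matching the quoted 96.1% utilization
```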