2022 IEEE 29th Symposium on Computer Arithmetic (ARITH)
DOI: 10.1109/arith54963.2022.00010

MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores

Cited by 7 publications (12 citation statements). References 18 publications.
“…These libraries provide various trade-offs. The reader interested in mathematical function algorithms can consult the books by Beebe (2017) and Muller (2016). Tests of recent mathematical libraries can be found in Innocente and Zimmermann (2022).…”
Section: Mathematical Libraries
confidence: 99%
“…Since division and square root are significantly slower than the other operations (see Figure 1.1), if we avoid using them, the only functions of one variable we can compute are piecewise polynomials. Hence the mathematical functions are very often approximated by polynomials (Cody and Waite 1980, Muller 2016, Beebe 2017). It is therefore important to be able to tightly control the accuracy of polynomial approximations, as well as the accuracy with which we evaluate polynomials in floating-point arithmetic.…”
Section: Tools For Polynomial Approximation Of Functions
confidence: 99%
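The point about computing functions with only additions and multiplications can be made concrete with Horner's scheme. The sketch below is illustrative only and is not taken from the cited works: it evaluates a degree-4 truncated-Taylor polynomial for exp near 0 (a real libm kernel would instead use a minimax polynomial on a reduced argument), and the function name `horner` and the coefficient choice are assumptions for this example.

```python
import numpy as np

def horner(coeffs, x):
    """Evaluate a polynomial with Horner's scheme.

    coeffs are ordered from the highest-degree term down to the constant
    term, so the evaluation needs only multiply-add steps: no division
    or square root, matching the "piecewise polynomial" argument above.
    """
    acc = np.zeros_like(x) + coeffs[0]
    for c in coeffs[1:]:
        acc = acc * x + c
    return acc

# Illustrative degree-4 truncated-Taylor coefficients for exp(x) around 0
# (highest degree first); a production kernel would use a minimax
# approximation on a reduced interval instead.
coeffs = [1.0 / 24, 1.0 / 6, 0.5, 1.0, 1.0]

x = np.linspace(-0.35, 0.35, 8)
print(np.max(np.abs(horner(coeffs, x) - np.exp(x))))  # worst-case error on this grid
```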
“…Our FPUs include the ExSdotp extension of [34]. This FPU computational unit implements the widening sum-of-dot-products operation.…”
Section: Functional Units
confidence: 99%
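The operation the excerpt refers to can be paraphrased numerically: an expanding (widening) sum of two products, e.g. two fp16 multiplications accumulated into an fp32 result. The snippet below is a minimal numerical sketch under that reading, not a model of the paper's datapath; the helper name `exsdotp_ref` is made up for illustration, and the sequential fp32 evaluation may round differently from the fused hardware unit.

```python
import numpy as np

def exsdotp_ref(acc_f32, a, b):
    """Reference for one widening sum-of-dot-products step (sketch).

    a and b each hold two fp16 operands; the products and the final
    accumulation are carried out in fp32, mirroring the "widening"
    behaviour described for ExSdotp (narrow inputs, wide result).
    Note: this sequential evaluation may not match the rounding of
    the actual fused datapath.
    """
    a = np.asarray(a, dtype=np.float16).astype(np.float32)
    b = np.asarray(b, dtype=np.float16).astype(np.float32)
    return np.float32(acc_f32) + a[0] * b[0] + a[1] * b[1]

acc = np.float32(0.0)
acc = exsdotp_ref(acc, [1.5, 2.0], [0.25, -1.0])  # 0.375 - 2.0 = -1.625
print(acc)
```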
“…We also achieve high performance with the wid-matmul16, a widening matrix multiplication that operates on 16-bit floating point operands and accumulates in a 32-bit register. In this scenario, the 64-bit ExSdotp [34] datapath of each FPU can execute four half-precision FMAs per cycle. For a large 128 × 128 matrix multiplication, Spatz reaches 61.5 FLOPS_HP/cycle, an FMA utilization of 96.1%.…”
Section: L1 Instruction
confidence: 99%
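The quoted utilization figure can be sanity-checked with simple arithmetic. The sketch below assumes a configuration with 8 FPUs, which is not stated in the excerpt; under that assumption, 4 half-precision FMAs per FPU per cycle give a peak of 64 HP FLOPs/cycle, and 61.5/64 ≈ 96.1% matches the quoted utilization.

```python
# Back-of-the-envelope check of the reported Spatz numbers.
# Assumption (not stated in the excerpt): the configuration has 8 FPUs.
fmas_per_fpu_per_cycle = 4   # half-precision FMAs through the 64-bit ExSdotp datapath
flops_per_fma = 2            # one multiply + one add
n_fpus = 8                   # assumed cluster configuration

peak_hp_flops_per_cycle = fmas_per_fpu_per_cycle * flops_per_fma * n_fpus  # 64
achieved = 61.5
print(achieved / peak_hp_flops_per_cycle)  # ~0.961, matching the quoted 96.1% utilization
```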