2013
DOI: 10.1145/2457443.2457447
Floating-Point Exponentiation Units for Reconfigurable Computing

Abstract: The high performance and capacity of current FPGAs make them suitable as acceleration co-processors. This article studies the implementation, for such accelerators, of the floating-point power function x^y as defined by the C99 and IEEE 754-2008 standards, generalized here to arbitrary exponent and mantissa sizes. Last-bit accuracy at the smallest possible cost is obtained thanks to a careful study of the various subcomponents: a floating-point logarithm, a modified floating-point exponential, and a truncated…
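The abstract builds x^y from three subcomponents: a logarithm, a multiplication by y, and an exponential. As a minimal point of reference, the C sketch below renders that same decomposition in software, evaluating x^y as 2^(y*log2(x)) with libm and handling a small subset of the special cases C99/IEEE 754-2008 mandate. The function name and the selection of handled cases are illustrative assumptions, not the paper's hardware design.

#include <math.h>

/* Illustrative software rendering of the x^y = 2^(y*log2(x)) decomposition.
 * The paper's FPGA units implement the logarithm, the product and the
 * exponential with extra internal precision; this version just chains
 * working-precision libm calls. */
double pow_via_log_exp(double x, double y)
{
    /* A few of the special cases required by C99 Annex F (subset only). */
    if (y == 0.0)                 return 1.0;   /* pow(x, +-0) = 1, even for NaN x */
    if (x == 1.0)                 return 1.0;   /* pow(1, y) = 1, even for NaN y   */
    if (isnan(x) || isnan(y))     return NAN;
    if (x < 0.0 && y != trunc(y)) return NAN;   /* negative base, non-integer exponent */

    double r = exp2(y * log2(fabs(x)));         /* core path */
    if (x < 0.0 && fmod(y, 2.0) != 0.0)         /* odd integer exponent: restore sign */
        r = -r;
    return r;
}

Chaining a rounded log2, a rounded product and exp2 in working precision cannot guarantee last-bit accuracy, especially for large exponents; that accuracy problem is what the paper's careful sizing of the subcomponents addresses in hardware.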

Cited by 16 publications (4 citation statements)
References 36 publications
“…For comparison purposes the table also presents the resource requirement of a state-of-the-art natural logarithm implementation based on piecewise polynomial approximation (PA) available in Altera DSP Builder [6], and with an iterative implementation [7] available in the open source FloPoCo tool [8].…”
Section: Results
confidence: 99%
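The comparison quoted above contrasts a piecewise polynomial (PA) natural logarithm, as shipped in Altera DSP Builder, with FloPoCo's iterative operator. The sketch below illustrates only the generic table-plus-polynomial structure of a PA logarithm; the segment count, polynomial degree and Taylor-derived coefficients are assumptions for illustration, not the implementation of either tool (real designs use minimax coefficients evaluated in fixed point).

#include <math.h>

#define SEG 64   /* number of segments over the mantissa range [1,2) */

static double c0[SEG], c1[SEG], c2[SEG];
static int tables_ready = 0;

/* Fill the per-segment degree-2 coefficients from a Taylor expansion of
 * ln() at each segment midpoint (illustrative; PA hardware would store
 * precomputed minimax coefficients in a ROM). */
static void build_tables(void)
{
    for (int i = 0; i < SEG; i++) {
        double m = 1.0 + (i + 0.5) / SEG;   /* segment midpoint */
        c0[i] = log(m);
        c1[i] = 1.0 / m;
        c2[i] = -0.5 / (m * m);
    }
    tables_ready = 1;
}

/* ln(x) for finite x > 0: split off the exponent, look up the segment of
 * the mantissa, evaluate the polynomial with Horner's rule. */
double ln_piecewise(double x)
{
    const double ln2 = 0.6931471805599453;
    if (!tables_ready) build_tables();

    int e;
    double m = frexp(x, &e);                /* x = m * 2^e, m in [0.5,1) */
    m *= 2.0; e -= 1;                       /* renormalise m into [1,2)  */

    int i = (int)((m - 1.0) * SEG);
    if (i >= SEG) i = SEG - 1;
    double d = m - (1.0 + (i + 0.5) / SEG); /* offset from the midpoint  */

    return (c2[i] * d + c1[i]) * d + c0[i] + e * ln2;
}

With 64 segments and degree 2 this reaches roughly single-precision accuracy; a hardware PA operator would choose the segmentation and degree to meet its target format and error budget.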
“…The softmax layer at the end of the classifier and the cross-entropy loss require computation of exponential and logarithmic functions. Since softmax and loss contribute to a negligible portion of the total workload and their accurate calculation requires complicated hardware [9,21], we assign their computation to CPU. Section 4.4 describes the scheduling.…”
Section: Accelerator Design, 4.1 Overview
confidence: 99%
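The quote above explains that softmax and the cross-entropy loss are assigned to the CPU because they require exponentials and logarithms. For reference, the sketch below shows a numerically stable CPU-side version of that computation for one sample; the function name and interface are illustrative assumptions, not the cited accelerator's code.

#include <math.h>
#include <stddef.h>

/* Softmax over `logits` (length n) followed by cross-entropy against the
 * true class `target`.  Subtracting the maximum logit keeps exp() in range. */
double softmax_cross_entropy(const double *logits, double *probs,
                             size_t n, size_t target)
{
    double m = logits[0];
    for (size_t i = 1; i < n; i++)
        if (logits[i] > m) m = logits[i];   /* max logit for stability */

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        probs[i] = exp(logits[i] - m);      /* shifted exponentials */
        sum += probs[i];
    }
    for (size_t i = 0; i < n; i++)
        probs[i] /= sum;                    /* normalise to probabilities */

    return -log(probs[target]);             /* loss for the true class */
}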
“…The power operator could be implemented such as in [4] in the near future. This would allow us to obtain the latency of the two last benchmarks.…”
Section: Benchmarks That Exposed the Limitations of HLS Tools
confidence: 99%