Efficient Large Integer Squarers on FPGA

Xu, Simin; Fahmy, Suhaib A.; McLoughlin, Ian

doi:10.1109/fccm.2013.35

Cited by 3 publications

(3 citation statements)

References 9 publications


“…Given a number M [0 : −n+1] of n bits, and making a = M [0 : −n/2+1] and b = M [−n/2 : −n + 1], each partial product has 1/4 of the complexity of the full product, and a reduction in circuit area of nearly 25% is achieved. For large values of n, breaking up M in 3 or more chunks allows for additional gains as shown for Xilinx FPGAs in [14].…”

Section: Squaring (mentioning)

confidence: 99%

Floating Point Calculation of the Cube Function on FPGAs

Osorio

2023

IEEE Trans. Parallel Distrib. Syst.


Specialized arithmetic units allow fast and efficient computation of less commonly used mathematical functions. The overall impact of those units would be negligible in a general purpose processor, as the added circuitry makes chips more complex even though most software would seldom use it. By contrast, custom computing machines are built for a specific task, and they can always benefit from specialized units if they are available. In this work, floating point architectures are proposed for computing the cube on Intel and Xilinx FPGAs. Those implementations reduce the cost and latency compared to using simple floating point multiplications and squarers.
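The split described in the citation statement above can be illustrated with a minimal sketch. It assumes an unsigned integer with an even bit width, rather than the fixed-point indexing used in the quote, and the function name `split_square` is hypothetical. Writing M = a·2^(n/2) + b gives M² = a²·2^n + 2ab·2^(n/2) + b², so only three half-width partial products are needed instead of the four of a general multiplication.

```python
def split_square(m: int, n: int) -> int:
    """Square an n-bit unsigned integer using three half-width partial products.

    Assumption: n is even and 0 <= m < 2**n. This models the arithmetic only,
    not any particular FPGA mapping.
    """
    if n % 2 or not 0 <= m < (1 << n):
        raise ValueError("expected an even n and 0 <= m < 2**n")
    half = n // 2
    a = m >> half                  # upper n/2 bits
    b = m & ((1 << half) - 1)      # lower n/2 bits
    aa = a * a                     # partial product 1
    ab = a * b                     # partial product 2, computed once and used twice
    bb = b * b                     # partial product 3
    # The doubling of the cross term is folded into the shift (half + 1).
    return (aa << n) + (ab << (half + 1)) + bb
```

Because the cross term `ab` appears twice in the expansion but is computed once, one of the four quarter-size products is saved, which is consistent with the reduction of nearly 25% mentioned in the quote.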


“…Various previous works cover the efficient realization of squarers on FPGAs by improving the utilization of the DSP units and handling the weighting between DSP and LUT resources [13]-[15]. In the work of Lee and Burgess [14], a combination of efficient 2×k multipliers together with a single DSP block is used to reduce its complexity.…”

Section: Introduction (mentioning)

confidence: 99%

“…In the work of Lee and Burgess [14], a combination of efficient 2×k multipliers together with a single DSP block is used to reduce its complexity. The works in [13] and [15] address large squarer design by using modifications of the Karatsuba-Ofman algorithm to save DSPs.…”

Section: Introduction (mentioning)

confidence: 99%

Resource Optimal Squarers for FPGAs

Böttcher

Kumm

Dinechin

2022

2022 32nd International Conference on Field-Programmable Logic and Applications (FPL)


Squaring is an essential operation in computer arithmetic that can be considered as a special case of multiplication where several simplifications can be applied to reduce the complexity of the resulting circuit. However, the design of a squarer is not straightforward for modern FPGAs that provide embedded DSP blocks and look-up tables (LUTs). This work proposes a flexible method to design resource optimal squarers, i.e., a squarer that uses a minimum number of LUTs for a user-defined number of DSP blocks. The method uses an integer linear programming (ILP) formulation based on a generalization of multiplier tiling. It is shown that the proposed squarer design method significantly improves the LUT utilization for a given number of DSPs over previous methods, while maintaining a similar critical path delay and latency.
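The citation statements above mention modifications of the Karatsuba-Ofman algorithm for large squarers. The following is a minimal sketch of the textbook Karatsuba-style identity applied to squaring, not the specific modification used in any of the cited works. The function name `karatsuba_square` and the unsigned, even-width input are assumptions made for illustration. It uses 2ab = (a + b)² − a² − b², so the one general multiplication of a two-way split is replaced by a third squaring plus extra additions; whether that saves DSP blocks depends on the target device.

```python
def karatsuba_square(m: int, n: int) -> int:
    """Square an n-bit unsigned integer using three smaller squarings.

    Assumption: n is even and 0 <= m < 2**n. Illustrative arithmetic model only.
    """
    if n % 2 or not 0 <= m < (1 << n):
        raise ValueError("expected an even n and 0 <= m < 2**n")
    half = n // 2
    a = m >> half                  # upper n/2 bits
    b = m & ((1 << half) - 1)      # lower n/2 bits
    aa = a * a                     # squaring of the upper half
    bb = b * b                     # squaring of the lower half
    s = a + b                      # sum needs half + 1 bits
    ss = s * s                     # third squaring, slightly wider operand
    cross = ss - aa - bb           # equals 2*a*b without a general multiply
    return (aa << n) + (cross << half) + bb
```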


Square-rich fixed point polynomial evaluation on FPGAs

Xu

Fahmy

McLoughlin

2014

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Self Cite


Polynomial evaluation is important across a wide range of application domains, so significant work has been done on accelerating its computation. The conventional algorithm, referred to as Horner's rule, involves the least number of steps but can lead to increased latency due to serial computation. Parallel evaluation algorithms such as Estrin's method have shorter latency than Horner's rule, but achieve this at the expense of large hardware overhead. This paper presents an efficient polynomial evaluation algorithm, which reforms the evaluation process to include an increased number of squaring steps. By using a squarer design that is more efficient than general multiplication, this can result in polynomial evaluation with a 57.9% latency reduction over Horner's rule and 14.6% over Estrin's method, while consuming less area than Horner's rule, when implemented on a Xilinx Virtex 6 FPGA. When applied in fixed point function evaluation, where precision requirements limit the rounding of operands, it still achieves a 52.4% performance gain compared to Horner's rule with only a 4% area overhead in evaluating 5th-degree polynomials.
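The two baseline schemes named in the abstract above can be sketched in software as follows. This is an illustrative model only: it does not reproduce the paper's square-rich algorithm, and the function names and the coefficient order (lowest degree first) are assumptions. Horner's rule forms one serial chain of multiply-add steps, while Estrin's method combines coefficient pairs level by level, with the powers x², x⁴, ... obtained by repeated squaring.

```python
from typing import Sequence


def horner(coeffs: Sequence[float], x: float) -> float:
    """Evaluate sum(coeffs[i] * x**i) with one serial multiply-add chain."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c          # each step depends on the previous one
    return acc


def estrin(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the same polynomial by pairwise combination (Estrin's method)."""
    terms = list(coeffs)
    power = x                      # x, then x**2, x**4, ... by squaring
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)        # pad so every term has a partner
        # All pairs at one level are independent and could run in parallel.
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power = power * power      # squaring step between levels
    return terms[0] if terms else 0
```

For a degree-5 polynomial, `horner` performs five dependent multiply-add steps, whereas `estrin` needs three levels, which is the latency-for-hardware trade-off the abstract describes.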
