Area-Optimized Low-Latency Approximate Multipliers for FPGA-based Hardware Accelerators

Ullah, Salim; Rehman, Semeen; Prabakaran, Bharath Srinivas; Kriebel, Florian; Hanif, Muhammad Abdullah; Shafique, Muhammad; Kumar, Akash

doi:10.1109/dac.2018.8465781

Cited by 20 publications

(23 citation statements)

References 5 publications

“…We have evaluated SIMDive against five SIMD and SISD accurate and approximate cutting-edge multipliers and dividers: performance-optimized accurate IPs of multiplier [36] and divider [37], provided by Xilinx Vivado, Mitchell [22], SoAs MBM [28], INZeD [29] and AAXD dividers [29] as they have the best resource-error trade-off when compared to the rest of designs ( [9,13,20,33]), CA [30] (based on approximate 4x4 multipliers) customized for FPGAs, and truncated multiplier (with 7x7 or 15x7 as the basic multiplier, the more accurate one is also exploited in SIMD structure). Note, hierarchical SIMD divider is not mathematically feasible by decomposing large one to small instances.…”

Section: Results and Discussion, 4.1 Experimental Setup (mentioning)

confidence: 99%
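
The statement above describes CA as being based on approximate 4x4 multipliers. As a rough illustration of the general hierarchical idea only, the Python sketch below composes an 8x8 product from four 4x4 partial products. The names `mul4_exact` and `mul8_hierarchical` are hypothetical, and the 4x4 block used here is exact, so this is not the approximate design from the cited paper; an approximate 4x4 block could be passed in through the `mul4` parameter.

```python
from typing import Callable


def mul4_exact(a: int, b: int) -> int:
    """Exact 4x4 multiplier used as a stand-in for a small building block."""
    if not (0 <= a < 16 and 0 <= b < 16):
        raise ValueError("operands must fit in 4 bits")
    return a * b


def mul8_hierarchical(a: int, b: int,
                      mul4: Callable[[int, int], int] = mul4_exact) -> int:
    """Build an 8x8 product from four 4x4 partial products."""
    if not (0 <= a < 256 and 0 <= b < 256):
        raise ValueError("operands must fit in 8 bits")
    # Split each operand into a high and a low 4-bit half.
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    # Four partial products, one per pair of halves.
    ll = mul4(a_lo, b_lo)
    lh = mul4(a_lo, b_hi)
    hl = mul4(a_hi, b_lo)
    hh = mul4(a_hi, b_hi)
    # Weight each partial product by the position of its halves and sum.
    return (hh << 8) + ((lh + hl) << 4) + ll
```

With the exact block the result matches ordinary multiplication for every 8-bit operand pair; any error in a real design would come from the substituted 4x4 block.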

“…In particular, delay and energy are improved by 4× and 4.6×, respectively, in our proposed divider in SISD mode, as compared to accurate counterpart. In contrast, CA [30] with hierarchical implementation approach dissipates even more energy with lower throughput than accurate multiplier.…”

Section: Simulation and Synthesis Results (mentioning)

confidence: 99%

“…Nevertheless, in spite of their advantages, hosting off-the-shelf fixed-precision DSP blocks falls short on fulfilling design requirements in a variety of domains. Beside being unable to perform division, some shortcomings that testify on their inefficiency are: 1) their fixed locations in FPGAs impose routing complexity and often results in degraded performance of some circuits [17] (and Viterbi decoder, Reed-Solomon and JPEG encoders discussed in [30]); 2) unable to be efficiently-utilized for multiplication precision below 18-bit [6,19] (the comparable performance and better energy-efficiency of small-scale LUT-based multipliers over DSP blocks further encourages their deployment in e.g. neural networks) 3) their limited ratio versus LUTs (<0.001) in multiplication-intensive applications or concurrently executing programs.…”

Section: Introduction (mentioning)

confidence: 99%

“…These techniques are not generic since approximation principles (as defined for ASIC) neglect differences in the underlying reconfigurable infrastructure and yield insignificant improvements when directly synthesized and ported to FPGAs [31]. Few designs have targeted FPGAs which are either approximate SISD [30,31] or accurate SIMD multipliers [15,18,24–26]. Moreover, lack of support for division in such architecture imposes substantial overhead on the design.…”

Section: Introduction (mentioning)

confidence: 99%

SIMDive: Approximate SIMD Soft Multiplier-Divider for FPGAs with Tunable Accuracy

Ebrahimi

Ullah

Kumar

2020

Proceedings of the 2020 on Great Lakes Symposium on VLSI

Self Cite

The ever-increasing quest for data-level parallelism and variable precision in ubiquitous multimedia and Deep Neural Network (DNN) applications has motivated the use of Single Instruction, Multiple Data (SIMD) architectures. To alleviate energy as their main resource constraint, approximate computing has re-emerged, albeit mainly specialized for their Application-Specific Integrated Circuit (ASIC) implementations. This paper presents, for the first time, a SIMD architecture based on a novel multiplier and divider with tunable accuracy, targeted at Field-Programmable Gate Arrays (FPGAs). The proposed hybrid architecture implements Mitchell's algorithms and supports precision variability from 8 to 32 bits. Experimental results obtained from Vivado, multimedia, and DNN applications indicate the superiority of the proposed architecture (both SISD and SIMD) over accurate and state-of-the-art approximate counterparts. In particular, the proposed SISD divider outperforms the accurate Intellectual Property (IP) divider provided by Xilinx with 4× higher speed and 4.6× less energy while tolerating less than 0.8% error. Moreover, the proposed SIMD multiplier-divider outperforms the accurate SIMD multiplier by achieving up to 26%, 45%, 36%, and 56% improvement in area, throughput, power, and energy, respectively.
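
The abstract refers to Mitchell's algorithms. As a minimal sketch of the classic Mitchell logarithmic multiplication only (not the tunable-accuracy design the paper proposes), the Python below approximates log2 of N = 2^k (1 + x) as k + x, adds the two approximate logarithms, and converts back. The function name `mitchell_multiply` is hypothetical, and floating point is used purely for readability where hardware would use fixed-point shifts and adds.

```python
def mitchell_multiply(a: int, b: int) -> float:
    """Approximate a * b for non-negative integers with Mitchell's method."""
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    if a == 0 or b == 0:
        return 0.0
    # Characteristic k: position of the leading one bit.
    k1 = a.bit_length() - 1
    k2 = b.bit_length() - 1
    # Mantissa x in [0, 1): the bits below the leading one, as a fraction.
    x1 = a / (1 << k1) - 1.0
    x2 = b / (1 << k2) - 1.0
    s = x1 + x2
    if s < 1.0:
        # No carry out of the fractional sum.
        return (1 << (k1 + k2)) * (1.0 + s)
    # The fractional sum carried into the characteristic.
    return (1 << (k1 + k2 + 1)) * s


print(mitchell_multiply(8, 4))   # 32.0 (powers of two are exact)
print(mitchell_multiply(5, 6))   # 28.0 (true product is 30)
print(mitchell_multiply(3, 3))   # 8.0  (true product is 9)
```

The plain method never overestimates the product, and its relative error is largest (about 11.1%) when both mantissas equal 0.5, as in the 3 × 3 case.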

“…On FPGAs, approximate multipliers have to be implemented in logic and are not as fast as dedicated multiplication circuits for many input sizes. In state-of-the-art libraries such as [26,34,35], even small 8×8 multipliers show delays between 5ns and 12.5ns, which corresponds to a maximum operating frequency between 80MHz and 200MHz. The targeted color space conversion employs larger multiplications of size 12×x, likely limiting performance even further.…”

Section: Color Space Conversion (mentioning)

confidence: 99%
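
The delay-to-frequency correspondence quoted above follows from f_max = 1 / t_delay, assuming the multiplier delay is the critical path of a single clock cycle. A minimal check of the quoted figures (the function name is illustrative, not from the cited work):

```python
def max_frequency_mhz(delay_ns: float) -> float:
    """Maximum clock frequency in MHz for a given critical-path delay in ns."""
    if delay_ns <= 0:
        raise ValueError("delay must be positive")
    # 1 / (delay in ns) gives GHz; multiply by 1000 to express it in MHz.
    return 1000.0 / delay_ns


print(max_frequency_mhz(5.0))    # 200.0
print(max_frequency_mhz(12.5))   # 80.0
```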

Region of Interest-Based Parameter Optimization for Approximate Image Processing on FPGAs

Manuel

Kreddig

Conrady

et al. 2021

IJNC

In recent years, technological advancements in computer hardware systems have been lagging behind the demand for increased computational power, especially in application domains such as signal and image processing. Approximate computing is a design paradigm for efficient system design to overcome this bottleneck by exploiting the resilience of such applications to inaccuracy in their computations and trading off quality for hardware resource savings. Over the years, many approximation techniques have been proposed on various abstraction layers and demonstrated their effectiveness in different applications. Combining multiple methods in a larger system can further increase the resulting benefits. However, this often leads to a non-trivial optimization task of finding the best parameterization across all employed methods. The interaction and influence of error propagation between individual components demand a global optimization of parameters that simultaneously considers all the parameters for each of the approximation techniques used. In this work, we propose a methodology for exploring such highly complex design spaces using a multi-objective genetic algorithm in an FPGA-based system. Simple models are used for the estimation of resource demands in terms of power together with the anticipated quality degradation. The optimization is carried out to determine the trade-off between these objectives. We demonstrate the effectiveness of our approach on a typical color processing pipeline by tailoring the encoding and genetic operations to the needs of this application. To focus the optimization into a relevant region of interest, we propose ROI-NSGA, a novel variant of nondominated solution selection, and compare its optimization efficiency with the traditional NSGA-II approach for the examined case study. 
Our results show that the models are able to guide the optimization, and that the genetic operations and selections are capable of finding Pareto-optimal solutions, among which the desired quality-resource trade-off can be chosen. In addition, the ROI-NSGA-based optimization outperforms the results obtained for the case study using the NSGA-II approach within the region of interest.
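
To make the notion of Pareto-optimal solutions concrete, the sketch below filters a set of candidate designs down to its nondominated subset for two objectives that are both minimized, for example a power estimate and a quality-degradation estimate. This is a plain nondominated filter under those assumptions; it is neither NSGA-II nor the ROI-NSGA variant proposed in the paper, and the names `dominates` and `pareto_front` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (power estimate, quality degradation), both minimized


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every objective and better in at least one."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
print(pareto_front(candidates))  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

Here (3.0, 4.0) is dropped because (2.0, 3.0) is better in both objectives; the remaining points represent different trade-offs, none of which is strictly better than another.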

Inexact Arithmetic Operators

Sekanina

Vasicek

Mrazek

2022

Approximate Computing Techniques
