Fahad Qureshi scite author profile

Abstract: This paper presents a new approach to design multiplierless constant rotators. The approach is based on a combined coefficient selection and shift-and-add implementation (CCSSI) for the design of the rotators. First, complete freedom is given to the selection of the coefficients, i.e., no constraints on the coefficients are set in advance and all the alternatives are taken into account. Second, the shift-and-add implementation uses advanced single constant multiplication (SCM) and multiple constant multiplication (MCM) techniques that lead to low-complexity multiplierless implementations. Third, the design of the rotators is done by a joint optimization of the coefficient selection and shift-and-add implementation. As a result, the CCSSI provides an extended design space that offers a larger number of alternatives with respect to previous works. Furthermore, the design space is explored in a simple and efficient way. The proposed approach has wide applications in numerous hardware scenarios. This includes rotations by single or multiple angles, rotators in single or multiple branches, and different scaling of the outputs. Experimental results for various scenarios are provided. In all of them, the proposed approach achieves significant improvements with respect to the state of the art.
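The idea of selecting rotator coefficients jointly with their shift-and-add cost can be illustrated with a small sketch. This is not the CCSSI algorithm from the paper: the function names (`nonzero_bits`, `select_rotator`, `rotate`), the exhaustive search range, and the cost measure (number of nonzero bits in plain binary, a crude stand-in for real SCM/MCM adder counts) are all assumptions made for illustration. The angle is assumed to lie in the first quadrant.

```python
import math

def nonzero_bits(n: int) -> int:
    """Crude cost proxy (assumption): nonzero bits in the plain binary form of |n|."""
    return bin(abs(n)).count("1")

def select_rotator(angle: float, max_coeff: int, max_angle_error: float):
    """Search all integer pairs (c, s) with 0 <= c, s <= max_coeff.

    The pair approximates a rotation by `angle` (radians, first quadrant)
    as multiplication by c + j*s. The scaling sqrt(c^2 + s^2) is left free.
    Returns (cost, angle_error, c, s) with the lowest cost, ties broken by
    the smaller angle error, or None if no pair meets the tolerance.
    """
    best = None
    for c in range(max_coeff + 1):
        for s in range(max_coeff + 1):
            if c == 0 and s == 0:
                continue  # the zero coefficient has no defined angle
            err = abs(math.atan2(s, c) - angle)
            if err > max_angle_error:
                continue
            cost = nonzero_bits(c) + nonzero_bits(s)
            if best is None or (cost, err) < best[:2]:
                best = (cost, err, c, s)
    return best

def rotate(x: int, y: int, c: int, s: int):
    """Compute (x + j*y) * (c + j*s) using integer arithmetic only."""
    return x * c - y * s, x * s + y * c

# Example: rotation by pi/8 with coefficients of at most 5 bits.
cost, err, c, s = select_rotator(math.pi / 8, 31, 0.01)
gain = math.hypot(c, s)  # scaling introduced by the rotator
print(c, s, cost, err, gain)
```

In hardware, each product such as `x * c` would be replaced by shifts and additions. Because the scaling is not forced to a fixed value, the search sees more candidate pairs than a search restricted to rounded sine and cosine values would.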


Low-complexity reconfigurable complex constant multiplication for FFTs

Qureshi

Gustafsson

2009


Efficient FPGA Mapping of Pipeline SDF FFT Cores

Ingemarsson

Källström

Qureshi

et al. 2017

IEEE Trans. VLSI Syst.


Generation of all radix-2 fast Fourier transform algorithms using binary trees

Qureshi

Gustafsson

2011


Hardware architectures for the fast Fourier transform

Garrido

Qureshi

Takala

et al. 2018


Addition Aware Quantization for Low Complexity and High Precision Constant Multiplication

Gustafsson

Qureshi

2010

IEEE Signal Process. Lett.


Abstract: Multiplication by constants can be efficiently realized using shifts, additions, and subtractions. In this work we consider how to select a fixed-point value for a real-valued, rational, or floating-point coefficient to obtain a low-complexity realization. It is shown that the process, denoted addition aware quantization, can often determine coefficients that have as low complexity as the rounded value, but with a smaller approximation error, by searching among coefficients with a longer wordlength.
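A minimal sketch of this search is given below. It is an illustration under stated assumptions, not the method of the paper: the adder cost is approximated by the number of nonzero digits in the canonical signed-digit (CSD) form minus one, whereas an optimal shift-and-add realization may need fewer adders. The names `csd_nonzero_digits`, `adder_cost`, and `addition_aware_quantize` are hypothetical.

```python
import math

def csd_nonzero_digits(n: int) -> int:
    """Count the nonzero digits of |n| in canonical signed-digit form."""
    n = abs(n)
    count = 0
    while n:
        if n & 1:
            digit = 2 - (n & 3)  # +1 if n mod 4 == 1, -1 if n mod 4 == 3
            n -= digit
            count += 1
        n >>= 1
    return count

def adder_cost(n: int) -> int:
    """Adder-count proxy (assumption): CSD nonzero digits minus one."""
    return max(csd_nonzero_digits(n) - 1, 0)

def addition_aware_quantize(coeff: float, frac_bits: int, extra_bits: int):
    """Find a coefficient with up to `extra_bits` more fractional bits that
    costs no more adders than the rounded value but has a smaller error.

    Returns (numerator, denominator, error, cost_limit).
    """
    base = round(coeff * 2**frac_bits)          # ordinary rounding
    cost_limit = adder_cost(base)
    base_err = abs(base / 2**frac_bits - coeff)
    scale = 2**(frac_bits + extra_bits)
    # Only values at least as accurate as the rounded one are of interest.
    lo = math.floor((coeff - base_err) * scale)
    hi = math.ceil((coeff + base_err) * scale)
    best_err, best_num = base_err, base << extra_bits
    for k in range(lo, hi + 1):
        err = abs(k / scale - coeff)
        if adder_cost(k) <= cost_limit and err < best_err:
            best_err, best_num = err, k
    return best_num, scale, best_err, cost_limit

# Example: 1/sqrt(2) rounded to 4 fractional bits, searched with 4 extra bits.
num, den, err, limit = addition_aware_quantize(1 / math.sqrt(2), 4, 4)
print(num, den, err, limit)
```

The rounded value, shifted to the longer wordlength, is always among the candidates, so the result is never less accurate than plain rounding under this cost proxy.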


Multiplierless Unity-Gain SDF FFTs

Garrido

Andersson

Qureshi

et al. 2016

IEEE Trans. VLSI Syst.


Abstract: In this paper we propose a novel approach to implement multiplierless unity-gain SDF FFTs. Previous methods achieve unity-gain FFTs by using either complex multipliers or non-unity-gain rotators with additional scaling compensation. Conversely, this paper proposes unity-gain FFTs without compensation circuits, even when using non-unity-gain rotators. This is achieved by a joint design of rotators so that the entire FFT is scaled by a power of two, which is then shifted to unity. This reduces the amount of hardware resources of the FFT architecture, while having high accuracy in the calculations. The proposed approach can be applied to any FFT size, and various designs for different FFT sizes are presented.


Unified architecture for 2, 3, 4, 5, and 7‐point DFTs based on Winograd Fourier transform algorithm

2013


In this letter, a unified hardware architecture that can be reconfigured to calculate 2, 3, 4, 5, or 7-point DFTs is presented. The architecture is based on the Winograd Fourier transform algorithm (WFTA) and the complexity is equal to a 7-point DFT in terms of adders/subtracters and multipliers plus only seven multiplexers introduced to enable reconfigurability. The processing element finds potential use in memory-based FFTs, where non-power-of-two sizes are required such as in DMB-T.

Introduction: The discrete Fourier transform (DFT) is an important algorithm in the field of digital signal processing. It transforms a signal from the time domain into the frequency domain, providing information about the spectrum of the signal. The direct computation of an N-point DFT requires a number of operations proportional to N^2. In order to reduce the number of arithmetic operations, many fast algorithms have been proposed, such as Cooley-Tukey [1], prime factor (PFA) [2] and Winograd Fourier transform (WFTA) [3] algorithms. Here, we refer to them collectively as fast Fourier transform (FFT) algorithms. These algorithms are based on decomposing an N-point DFT recursively into smaller DFTs, leading to a reduction of the computational complexity [4].

Most FFT algorithms and architectures have focused on power-of-two size DFTs. However, recently the interest in non-power-of-two size DFTs has increased, mainly motivated by the 3780-point DFT in Chinese digital TV (DMB-T) [5,6] based on orthogonal frequency-division multiplexing (OFDM). In the receiving side of OFDM systems, an inverse DFT (IDFT) is usually required, which is easily computed using a DFT processor.

Most FFT architectures are not well optimised for the computation of non-power-of-two-point FFTs, which make use of small point DFTs with varying sizes, as well as more complex data management. Some pipelined architectures for the 3780-point DFT in DMB-T have been proposed [5,6]. However, the streaming nature of a pipelined architecture means that it can often process data at a much higher rate than the required 7.56 Mb/s. Hence, the amount of computational resources is often excessive. In [7], individual processing elements for 3 and 5-point DFTs were proposed and considered for a pipelined architecture. However, they were not based on the WFTA and have a slightly higher complexity.

Memory-based FFTs are often more suitable for low data rate applications (where the clock frequency offered by the implementation technology is higher than the data rate), as they allow reusing the computational resources to a higher degree [8]. For a non-power-of-two memory-based FFT, a number of challenges remain. One is how to carry out the more complex data management to interconnect the small DFTs. Another one is to develop a processing element that is suitable for computing small point DFTs of different sizes. This letter presents a unified architecture to compute the 2, 3, 4, 5, and 7-point DFTs by a single processing element. This architecture can be us...



Fahad Qureshi

Low-Complexity Multiplierless Constant Rotators Based on Combined Coefficient Selection and Shift-and-Add Implementation (CCSSI)
