An algebraic integer (AI) based time-multiplexed row-parallel architecture and two final-reconstruction step (FRS) algorithms are proposed for the implementation of bivariate AI-encoded 2-D discrete cosine transform (DCT). The architecture directly realizes an error-free 2-D DCT without using FRSs between row-column transforms, leading to an 8×8 2-D DCT which is entirely free of quantization errors in AI basis. As a result, the user-selectable accuracy for each of the coefficients in the FRS facilitates each of the 64 coefficients to have its precision set independently of others, avoiding the leakage of quantization noise between channels as is the case for published DCT designs. The proposed FRS uses two approaches based on (i) optimized Dempster-Macleod multipliers and (ii) expansion factor scaling. This architecture enables low-noise high-dynamic range applications in digital video processing that requires full control of the finite-precision computation of the 2-D DCT. The proposed architectures and FRS techniques are experimentally verified and validated using hardware implementations that are physically realized and verified on FPGA chip. Six designs, for 4-and 8-bit input word sizes, using the two proposed FRS schemes, have been designed, simulated, physically implemented and measured. The maximum clock rate and block-rate achieved among 8-bit input designs are 307.787 MHz and 38.47 MHz, respectively, implying a pixel rate of 8×307.787≈2.462 GHz if eventually embedded in a real-time video-processing system. The equivalent frame rate is about 1187.35 Hz for the image size of 1920×1080. All implementations are functional on a Xilinx Virtex-6 XC6VLX240T FPGA device.
Multi-beamforming is an important requirement for broadband space imaging applications based on dense aperture arrays (AAs). Usually, the discrete Fourier transform is the transform of choice for AA electromagnetic imaging. Here, the discrete cosine transform (DCT) is proposed as an alternative, enabling the use of emerging fast algorithms that offer greatly reduced complexity in digital arithmetic circuits. We propose two novel high-speed digital architectures for recently proposed fast algorithms (Bouguezel, Ahmad and Swamy 2008 Electron. Lett. 44 1249–50) (BAS-2008) and (Cintra and Bayer 2011 IEEE Signal Process. Lett. 18 579–82) (CB-2011) that provide good approximations to the DCT at zero multiplicative complexity. Further, we propose a novel DCT approximation having zero multiplicative complexity that is shown to be better for multi-beamforming AAs when compared to BAS-2008 and CB-2011. The far-field array pattern of ideal DCT, BAS-2008, CB-2011 and proposed approximation are investigated with error analysis. Extensive hardware realizations, implementation details and performance metrics are provided for synchronous field programmable gate array (FPGA) technology from Xilinx. The resource consumption and speed metrics of BAS-2008, CB-2011 and the proposed approximation are investigated as functions of system word size. The 8-bit versions are mapped to emerging asynchronous FPGAs leading to significantly increased real-time throughput with clock rates at up to 925.6 MHz implying the fastest DCT approximations using reconfigurable logic devices in the literature.
The usage of linear transformations has great relevance for data decorrelation applications, like image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data and fast algorithms can be developed to real time applications. However, the DTT fast algorithm presented in literature possess high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards shows good performance of the proposed transform. Regarding hardware resource consumption for FPGA shows 43.1% reduction of configurable logic blocks and ASIC place and route realization shows 57.7% reduction in the area-time figure when compared with the 2-D version of the exact DTT.
KeywordsApproximate transforms, discrete Tchebichef transform, fast algorithms, image and video coding * Paulo A. M. Oliveira is with the Signal Processing Group,
Multiple independent radio frequency (RF) beams find applications in communications, radio astronomy, radar, and microwave imaging. An N -point FFT applied spatially across an array of receiver antennas provides N -independent RF beams at N 2 log 2 N multiplier complexity. Here, a low-complexity multiplierless approximation for the 8-point FFT is presented for RF beamforming, using only 26 additions. The algorithm provides eight beams that closely resemble the antenna array patterns of the traditional FFT-based beamformer albeit without using multipliers. The proposed FFT-like algorithm is useful for low-power RF multi-beam receivers; being synthesized in 45 nm CMOS technology at 1.1 V supply, and verified on-chip using a Xilinx Virtex-6 Lx240T FPGA device. The CMOS simulation and FPGA implementation indicate bandwidths of 588 MHz and 369 MHz, respectively, for each of the independent receive-mode RF beams.
A continuous-time (CT) radio frequency (RF) antenna array beamformer and analog circuit based on a discrete-space-continuous-time (DSCT) 2-D fan-filter having transfer function is derived. The proposed transfer function is based on a 2-D FIR discrete domain fan filter. The discrete domain prototype is converted to the proposed mixed-domain DSCT analog filter by replacing unit sampled delays with CT analog first-order all-pass networks corresponding to the bilinear transform. First-order all-pass network is a poor approximation to a CT delay . To address this, a novel broadband pre-warping method is proposed to exactly compensate for such "bilinear warping". A 65 nm CMOS VLSI circuit for is proposed and an example fan filter with axis oriented at , half-fan-angle and maximum frequency is simulated employing closed-form expressions, an ABCD parameter based model and 65 nm CMOS simulations in Cadence. A stop-band interference rejection of 38 dB is verified by BSIM4 based simulations. The proposed circuit for operates at 3.7 mA from a 1.2 V supply. The beamfomer is shown to operate correctly in the presence of PVT variations of .Index Terms-Analog beamforming, CMOS, delay approximation, fan filter, VLSI.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.