Conflict-free parallel memory access scheme for FFT processors

Takala, Jarmo; Jarvinen, T.; Sorokin, Harri

doi:10.1109/iscas.2003.1205957

Cited by 31 publications

(32 citation statements)

References 8 publications

“…Conventionally, complex multipliers and adders constitute the butterfly unit and is the major speed impediment of FFT processors [4]. Though the VLSI multiplier structure can efficiently perform the complex multiplication, they are not much efficient in case of trigonometrical operations.…”

Section: Introduction (mentioning)

confidence: 99%

FPGA Based Memory-Less Phase Generation for Butterfly Operation of CORDIC FFT Processor

Shome, Giri, Datta

2013

IJCEE

Abstract: Efficient implementation of the Fast Fourier Transform (FFT) in the hardware paradigm has been a major challenge for design engineers. Twiddle factor generation and the complex multiplication that follows are the decisive steps of VLSI implementation of the FFT. Conventional FFT analyzers call for a dedicated memory bank to store the twiddle factor angles in a predefined order. This storage results in an increased resource utilization, which grows with N, the length of the Fourier transform. This study presents a phase generation scheme that generates the necessary twiddle factor angles with simple hardware logic, depending on the present step and stage of the FFT. This eliminates the need for memory storage elements. Use of CORDIC to carry out complex multiplication further enhances system throughput. The present logic has been synthesized in a Spartan 3E FPGA. The timing diagram results match the theoretical analysis, and the synthesis report supports minimal hardware resource utilization.

“…1. If large amount of data must be stored, e.g., in long FFTs, memory-based structures [1][2][3] are attractive. For relatively small storage requirements, however, register-based structures are better alternatives, and thus, they are considered in this paper.…”

Section: Introduction (mentioning)

confidence: 99%

Stride Permutation Networks for Array Processors

Jarvinen, Salmela, Sorokin, et al.

2007

J VLSI Sign Process Syst Sign Im

In several digital signal processing algorithms, computational nodes are organized in consecutive stages and data is reordered between these stages. Parallel computation of such algorithms with reduced number of processing elements implies that several computational nodes are assigned to each element. As a drawback, permutations become more complex and require data storage. In this paper, a systematic design methodology for stride permutation networks is derived. These permutations are represented with Boolean matrices, which are decomposed and mapped directly onto register-based networks. The resulting networks are regular and scalable and they support any stride of power-of-two. In addition, the networks reach the lower bound in the number of registers indicating area-efficiency. Since the proposed methodology is systematic, it can be exploited in automated design generation.
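As a rough illustration of what a power-of-two stride permutation does to the data order, the sketch below reorders a list by reading it at a fixed stride and checks that this is the same as rotating the bits of each index. The function names `stride_permute` and `rotl_index` are hypothetical, and this software model is only an assumption-laden aid to reading the abstract; it is not the register-based network or the Boolean-matrix decomposition derived in the paper.

```python
def stride_permute(x, stride):
    """Return x read out at the given stride (both sizes must be powers of two).

    Output position j takes the element at index (j*stride mod n) + (j*stride div n).
    """
    n = len(x)
    if n == 0 or n & (n - 1) or stride & (stride - 1) or not 0 < stride <= n:
        raise ValueError("length and stride must be powers of two, stride <= length")
    return [x[(j * stride) % n + (j * stride) // n] for j in range(n)]


def rotl_index(j, n_bits, s):
    """Rotate the n_bits-wide index j left by s bit positions."""
    mask = (1 << n_bits) - 1
    return ((j << s) | (j >> (n_bits - s))) & mask


# Stride-by-2 on eight elements: even-indexed elements first, then odd-indexed.
data = list(range(8))
permuted = stride_permute(data, 2)

# The same reordering expressed as a 1-bit left rotation of a 3-bit index.
via_rotation = [data[rotl_index(j, 3, 1)] for j in range(8)]
```

Because the reordering only moves index bits around, it can be written as a permutation matrix acting on the bits of the index, which is one way to read the abstract's reference to Boolean matrices.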

“…To test the proposed parallel memory, the multi-port 2048 × 32 bits data memory was replaced with the parallel memory logic and two 1024 × 32-bit single-port MMs. A general form of the used storage scheme for FFT processors was presented in [9]. In our case N = 2 and the scheme reduces to a parity bit computation of an address i_k from the LSU_k.…”

Section: Methods (mentioning)

confidence: 99%
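The two-module case described in the statement above can be sketched in a few lines. The sketch assumes a 2048-word address space split over two 1024-word single-port modules, takes the module number S(i) to be the parity of the address bits, and checks that the two operands of every radix-2 butterfly land in different modules. The row address a(i) = i >> 1 and the in-place radix-2 access pattern are assumptions made for this illustration; the quoted text states only that the scheme reduces to a parity computation.

```python
def parity(x):
    """XOR of all bits of x: 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def module_index(i):
    """S(i): which of the two single-port modules stores address i."""
    return parity(i)


def module_address(i):
    """a(i): row inside the module. Dropping the LSB is an assumed mapping."""
    return i >> 1


def butterfly_pairs(n_bits, stage):
    """Yield the operand address pairs of one radix-2 stage.

    The two operands of a butterfly differ only in bit `stage`.
    """
    bit = 1 << stage
    for i in range(1 << n_bits):
        if not i & bit:
            yield i, i | bit


def is_conflict_free(n_bits):
    """True if no butterfly in any stage reads both operands from one module."""
    return all(
        module_index(a) != module_index(b)
        for stage in range(n_bits)
        for a, b in butterfly_pairs(n_bits, stage)
    )


N_BITS = 11  # 2048 words in total, as in the quoted experiment
conflict_free = is_conflict_free(N_BITS)
locations = {(module_index(i), module_address(i)) for i in range(1 << N_BITS)}
```

Flipping a single address bit always flips the parity, which is why the two operands never collide. The pair (S(i), a(i)) is also unique for each address, because the dropped bit can be recovered from the parity and the remaining bits.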

“…The size of a single-port MM was kept constant in 1024 × 32 bits. The related S(i) and a(i) were derived from the FFT storage scheme in [9]. Note that the dual-port memory of size 2048 × 32 in Table 1 requires an area higher by a factor of 1.64 than the parallel memory of size 4096 × 32 with four ports.…”

Section: Methods (mentioning)

confidence: 99%

“…Often, the designs which consider conflict resolving multi-port memory architecture, employ some form of a simple low-order interleaving scheme [2,4,6,8]. On the other hand, new storage scheme proposals, like the one in [9] employed in this paper, concentrate to conflict-free, complex memory storage and rarely consider conflict resolving support. This paper presents a hardware in detail for dynamic conflict resolving.…”

Section: Introduction (mentioning)

confidence: 99%

Parallel Memory Architecture for TTA Processor

Tanskanen, Pitkänen, Makinen, et al.

Lecture Notes in Computer Science

Abstract. A conflict resolving parallel data memory system for Transport Triggered Architecture (TTA) is described. The architecture is generic and reusable to support various application-specific designs. With parallel memory, a more area- and power-consuming multi-port memory can be replaced with single-port memory modules. The number of ports can be increased beyond what is available in a design library for multi-port memories. In an FFT TTA example, dual-port data memory was replaced by the proposed architecture. To avoid memory conflicts, the original code was rescheduled and the TTA core was regenerated for the new schedule. The original memory required an area higher by a factor of 3.38 and energy higher by a factor of 1.70. In this case, the energy consumption of the processor core increased so that system energy consumption remained about the same. However, the original system required an area higher by a factor of 1.89.
