Applying a joint code and decoder design methodology, we develop a high-speed partly parallel decoder architecture for (3, k)-regular LDPC codes, based on which a 9216-bit, rate-1/2 (3, 6)-regular LDPC code decoder is implemented on a Xilinx FPGA device. With a maximum of 18 iterations per code block, this partly parallel decoder supports a maximum symbol throughput of 54 Mbps and achieves a BER of $10^{-6}$ at 2 dB over an AWGN channel.
INTRODUCTION
Thanks to their excellent performance, Low-Density Parity-Check (LDPC) codes [1][2] have been widely considered as next-generation error-correcting codes for telecommunication and magnetic storage. Defined as the null space of a very sparse M × N parity check matrix H, an LDPC code is typically represented by a bipartite graph, called a Tanner graph, in which one set of N variable nodes corresponds to the codeword bits, another set of M check nodes corresponds to the parity check constraints, and each edge corresponds to a non-zero entry in the parity check matrix H. An LDPC code is known as a (j, k)-regular LDPC code if each column and each row of its parity check matrix have j and k non-zero entries, respectively. The construction of an LDPC code is typically random. As illustrated in Fig. 1, an LDPC code is decoded by the iterative belief-propagation (BP) algorithm [2], which directly matches its Tanner graph. A fully parallel decoder is realized by directly instantiating the BP decoding algorithm in hardware. Such a fully parallel decoder can achieve extremely high decoding speed, e.g., a 1024-bit, rate-1/2 LDPC code fully parallel decoder [4] with a maximum symbol throughput of 1 Gbit/s has been implemented using ASIC technology. However, the primary disadvantage of the fully parallel design is that, as the code length increases, the hardware complexity becomes prohibitive for many practical purposes, e.g., the ASIC LDPC decoder [4] with only 1K-bit code length consumes 1.7M gates. Moreover, as pointed out in [4], the routing overhead is quite formidable due to the large code length and the randomness of the Tanner graph.

A joint code and decoder design methodology [5] was recently proposed for (3, k)-regular LDPC code and partly parallel decoder design to achieve appropriate trade-offs between hardware complexity and decoding throughput.
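The definitions given earlier in this section (parity check matrix H, Tanner graph, (j, k)-regularity) can be made concrete with a small sketch. The code below is illustrative only: it uses a Gallager-style random construction (stacked column-permuted bands) as a stand-in and is not the joint code construction of [5]. The function names `gallager_regular_H`, `tanner_graph`, `is_regular`, `in_null_space` and `graph_size` are hypothetical helpers introduced here.

```python
import random


def gallager_regular_H(n, j, k, seed=0):
    """Build a (j, k)-regular parity check matrix as a list of 0/1 rows.

    Assumption: Gallager-style construction, used only for illustration.
    H is made of j bands of n // k rows; each band places k ones per row
    and exactly one 1 per column, using a random column permutation.
    """
    if n % k != 0:
        raise ValueError("n must be a multiple of k")
    rng = random.Random(seed)
    rows_per_band = n // k
    H = []
    for band in range(j):
        cols = list(range(n))
        if band > 0:  # first band keeps the identity column order
            rng.shuffle(cols)
        for r in range(rows_per_band):
            row = [0] * n
            for c in cols[r * k:(r + 1) * k]:
                row[c] = 1
            H.append(row)
    return H


def tanner_graph(H):
    """Return adjacency lists (check -> variables, variable -> checks)."""
    m, n = len(H), len(H[0])
    check_to_var = [[v for v in range(n) if H[c][v]] for c in range(m)]
    var_to_check = [[c for c in range(m) if H[c][v]] for v in range(n)]
    return check_to_var, var_to_check


def is_regular(H, j, k):
    """True if every column of H has j ones and every row has k ones."""
    check_to_var, var_to_check = tanner_graph(H)
    return (all(len(a) == k for a in check_to_var)
            and all(len(a) == j for a in var_to_check))


def in_null_space(H, x):
    """True if H x = 0 over GF(2), i.e. x satisfies every parity check."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)


def graph_size(n, j, k):
    """Number of check nodes M and Tanner-graph edges for a (j, k)-regular code."""
    if (n * j) % k != 0:
        raise ValueError("n * j must be a multiple of k")
    return n * j // k, n * j


if __name__ == "__main__":
    H = gallager_regular_H(12, 3, 6, seed=1)
    print(is_regular(H, 3, 6))          # True
    print(in_null_space(H, [0] * 12))   # True: the all-zero word is a codeword
    print(graph_size(9216, 3, 6))       # (4608, 27648)
```

For the 9216-bit (3, 6)-regular code considered here, the same counting gives M = 4608 check nodes and 27648 edges, which is consistent with the stated rate of 1/2 assuming the rows of H are linearly independent. The edge count also indicates the number of messages exchanged per half-iteration, which is why the routing of a fully parallel decoder grows quickly with code length.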
In this paper, applying this joint design methodology, we develop a high-speed partly parallel decoder architecture for (3, k)-regular LDPC codes, based on which we implement a 9216-bit, rate-1/2 (3, 6)-regular LDPC code decoder on a Xilinx Virtex FPGA device. We significantly modify the original decoder structure [5] to improve the decoding throughput and simplify the control logic design. We propose a novel scheme that realizes the random connectivity with two concatenated routing networks, in which the random hardwired routings are localized to significantly reduce the routing overhead. Based on the post-routing static timing analysis, with a maximum of 18 decoding iterations, this decoder ...