2013
DOI: 10.1002/cpe.3093

Efficient parallel implementation of three‐point viterbi decoding algorithm on CPU, GPU, and FPGA

Abstract: SUMMARY: In wireless communication, the Viterbi decoding algorithm (VDA) is one of the most popular channel decoding algorithms and is widely used in WLAN, WiMAX, and 3G communications. However, the throughput of a Viterbi decoder is constrained by its convolutional characteristic. Recently, the three‐point VDA (TVDA) was proposed to solve this problem. In TVDA, the whole procedure is divided into three phases: the forward, trace‐back, and decoding phases. In this paper, we analyze the parallelism of TVDA and pr…
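As a rough editorial illustration of what the forward and trace-back phases of a Viterbi decoder compute, here is a minimal sequential sketch in Python. The HMM-style framing, the function name viterbi_decode, and all variable names are assumptions made for illustration only; this is not the paper's TVDA partitioning nor its parallel CPU/GPU/FPGA implementation.

```python
import numpy as np

def viterbi_decode(obs, log_trans, log_emit, log_init):
    """Sequential Viterbi: forward (path-metric) phase + trace-back/decoding phase.

    obs       : sequence of observation indices, length T
    log_trans : (S, S) log transition scores between states
    log_emit  : (S, V) log emission scores per state and symbol
    log_init  : (S,)   log initial-state scores
    """
    T, S = len(obs), log_init.shape[0]
    metric = np.full((T, S), -np.inf)        # best score of any path ending in each state
    backptr = np.zeros((T, S), dtype=int)    # survivor (best predecessor) per state

    # Forward phase: propagate path metrics one trellis step at a time.
    metric[0] = log_init + log_emit[:, obs[0]]
    for t in range(1, T):
        scores = metric[t - 1][:, None] + log_trans   # (S, S): previous state -> current state
        backptr[t] = scores.argmax(axis=0)
        metric[t] = scores.max(axis=0) + log_emit[:, obs[t]]

    # Trace-back and decoding phase: follow stored survivors backwards.
    state = int(metric[-1].argmax())
    path = [state]
    for t in range(T - 1, 0, -1):
        state = int(backptr[t, state])
        path.append(state)
    return path[::-1]

# Toy usage: 2 states, 3 observation symbols.
lt = np.log(np.array([[0.7, 0.3], [0.4, 0.6]]))
le = np.log(np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]))
li = np.log(np.array([0.6, 0.4]))
print(viterbi_decode([0, 1, 2, 2], lt, le, li))
```

The point of the TVDA split, as the abstract frames it, is that these phases expose different amounts of parallelism; the loop structure above is deliberately the plain serial form.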

Cited by 18 publications (12 citation statements)
References 19 publications (78 reference statements)

“…However, our results confirm that the common comparison between serial CPU implementations and GPU implementations is quite misleading (e.g., ): In such scenarios, very large speedups in favor of GPUs are achieved; however, the GPU advantage either disappears completely as soon as the full CPU capacities are utilized or at least becomes considerably smaller (speedups in the single–digit range). This is what we observe for the min‐warping algorithm as well with speedups between 2 and 8.4 in favor of GPUs compared with multi‐core‐SIMD.…”
Section: Discussion (supporting, confidence: 68%)
“…There exists a considerable amount of studies in which the performance of CPU and GPU implementations is compared for specific tasks (for example, ). Even closer to our work are studies which include FPGAs or directly compare FPGAs with GPUs. The benchmarked applications are from many different fields like machine learning, neural modeling, optimization, numerical algorithms, image and video processing [16, 22, 24–27, 30], computer tomography, molecular sequencing, financial simulations, encryption and decoding, or analog circuit simulation.…”
Section: Introduction (mentioning, confidence: 99%)
“…Although the use of graphics processing units (GPUs) is now de rigueur in applications of neural networks and made easy through toolkits like Theano (Theano Development Team, 2016), there has been little previous work, to our knowledge, on acceleration of weighted finite-state computations on GPUs (Narasiman et al., 2011; Li et al., 2014; Peng et al., 2016; Chong et al., 2009). In this paper, we consider the operations that are most likely to have high speed requirements: decoding using the Viterbi algorithm, and training using the forward-backward algorithm.…”
Section: Introduction (mentioning, confidence: 99%)
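The statement above names Viterbi decoding and forward-backward training as the two core operations. Purely for illustration, and not drawn from any of the cited implementations, a log-space forward-backward pass on the same toy HMM layout as the decoder sketch earlier might look like this (the function name, shapes, and normalization choice are assumptions):

```python
import numpy as np

def forward_backward(obs, log_trans, log_emit, log_init):
    """Log-space forward-backward: per-time-step state posteriors for an HMM."""
    T, S = len(obs), log_init.shape[0]
    alpha = np.zeros((T, S))   # alpha[t, j] = log P(obs[:t+1], state_t = j)
    beta = np.zeros((T, S))    # beta[t, i]  = log P(obs[t+1:] | state_t = i)

    # Forward pass.
    alpha[0] = log_init + log_emit[:, obs[0]]
    for t in range(1, T):
        alpha[t] = np.logaddexp.reduce(alpha[t - 1][:, None] + log_trans, axis=0) \
                   + log_emit[:, obs[t]]

    # Backward pass (beta at the final step is log 1 = 0).
    for t in range(T - 2, -1, -1):
        beta[t] = np.logaddexp.reduce(
            log_trans + log_emit[:, obs[t + 1]] + beta[t + 1], axis=1)

    # Posterior state marginals, normalized per time step.
    gamma = alpha + beta
    gamma -= np.logaddexp.reduce(gamma, axis=1)[:, None]
    return np.exp(gamma)
```

Unlike the Viterbi trace-back, which commits to a single survivor path, this pass keeps soft per-state posteriors, which is why the cited work treats decoding and training as separate acceleration targets.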