Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing (DASIP)
DOI: 10.1109/dasip.2014.7115637

CUVLE: Variable-length encoding on CUDA

Abstract: Data compression is the process of representing information in a compact form, in order to reduce the storage requirements and, hence, the communication bandwidth. It has been one of the critical enabling technologies for the ongoing digital multimedia revolution for decades. In the variable-length encoding (VLE) compression method, the most frequently occurring symbols are replaced by codes with shorter lengths. As it is a common strategy in many compression applications, efficient parallel implementations o…
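The core idea of VLE described in the abstract can be sketched in a few lines. This is an illustrative example only, with hypothetical symbols and codewords (not taken from the paper): the most frequent symbol receives the shortest code.

```python
# Minimal variable-length encoding sketch: each symbol is replaced by
# its codeword; frequent symbols get shorter codewords, so the output
# bit string is shorter than a fixed-length encoding.
def encode(data, table):
    """Concatenate each symbol's variable-length codeword."""
    return "".join(table[s] for s in data)

data = "aaaabbc"
# Hypothetical prefix code chosen by frequency: 'a' is most frequent.
table = {"a": "0", "b": "10", "c": "11"}
bits = encode(data, table)
print(bits)  # -> "0000101011" (10 bits vs. 14 for a 2-bit fixed code)
```

With a 2-bit fixed-length code the same input would need 14 bits, so even this tiny prefix code saves space.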


Cited by 8 publications (5 citation statements)
References 6 publications
“…Since the prefix‐sums can be computed efficiently in parallel, Huffman encoding can also be done in parallel. Several GPU implementations for Huffman encoding using this idea have been presented.12,32 On the other hand, Huffman decoding is very hard to parallelize, because codeword sequence Y has no separator and each codeword cannot be identified without reading bits ahead of it.…”
Section: Deflate Encoding and Decoding
confidence: 99%
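The prefix-sum idea behind parallel encoding mentioned in this citation can be illustrated with a sketch. This is sequential Python standing in for a GPU scan, with hypothetical symbols and codes: the output bit offset of each symbol is the exclusive prefix sum of the codeword lengths before it, so once the scan is done every codeword can be written independently by its own thread.

```python
# Exclusive prefix-sum (scan) of codeword lengths: symbol i's output
# bit offset is the total length of all codewords before it. On a GPU
# this scan is the only cross-symbol dependency in the encoder.
from itertools import accumulate

def bit_offsets(symbols, table):
    lengths = [len(table[s]) for s in symbols]
    # Exclusive scan: offset[i] = lengths[0] + ... + lengths[i-1].
    return [0] + list(accumulate(lengths))[:-1]

table = {"a": "0", "b": "10", "c": "11"}  # hypothetical prefix code
offsets = bit_offsets("abacab", table)
print(offsets)  # -> [0, 1, 3, 4, 6, 7]
```

After the scan, each thread knows exactly where its codeword's bits land, which is why encoding parallelizes cleanly while decoding does not.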
“…Several GPU implementations for Huffman encoding using this idea have been presented. 12,32 On the other hand, Huffman decoding is very hard to parallelize, because codeword sequence Y has no separator and each codeword cannot be identified without reading bits ahead of it. Hence, a parallel divide-and-conquer approach that decodes Y from the middle of Y does not work.…”
Section: 2
confidence: 99%
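The decoding difficulty this citation describes can be seen in a sketch (hypothetical code table, not from the paper): a prefix-code bit stream has no separators, so codeword boundaries are only discovered by scanning left to right, and a thread cannot simply start decoding from the middle of the stream.

```python
# Sequential prefix-code decoder: boundaries emerge only as the bits
# are consumed in order, which is why divide-and-conquer decoding
# starting mid-stream does not work without extra bookkeeping.
def decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:        # a boundary is found only here
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

table = {"a": "0", "b": "10", "c": "11"}  # hypothetical prefix code
print(decode("0000101011", table))  # -> "aaaabbc"
```

Starting at, say, bit 5 of the same stream would misinterpret the suffix, because the decoder cannot know whether bit 5 begins a codeword without having decoded bits 0-4 first.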
“…As in our previous works [7,8], the thread-block synchronization mechanism proposed by Yan et al [47] is used for synchronizing the reads with the writes in global memory. In this case, it is applied on both horizontal (d_info_A) and vertical (d_info_B) dimensions and the reads are performed using atomic operations.…”
Section: Fig 14 Transmission Of Parameter Nb Through Global Memory
confidence: 99%
“…Fuentes-Alventosa et al [47] proposed a GPU implementation of Huffman coding using CUDA with a given table of variable-length codes, which improves the performance by more than 20× compared with a serial CPU implementation. Rahmani et al [48] proposed a CUDA implementation of Huffman coding based on serially constructing the Huffman codeword tree and generating the byte stream in parallel, which can achieve up to 22× speedups compared with a serial CPU implementation without any constraint on the maximum codeword length or data entropy.…”
Section: Huffman Coding On GPU
confidence: 99%