High-Throughput Variable-to-Fixed Entropy Codec Using Selective, Stochastic Code Forests

Torres, Manuel Martínez; Hernández-Cabronero, Miguel; Blanes, Ian; Serra-Sagristà, Joan

doi:10.1109/access.2020.2991314

Cited by 1 publication

(3 citation statements)

References 53 publications

(97 reference statements)

“…Even with our architecture, learned image encoding and decoding is still a highly GPU-bound algorithm on most platforms. Therefore, as future work, by combining our method with other research to optimize the neural model, even higher performance could be achieved; on the other hand, the introduction of higher performance entropy coders such as [16] and [19] could potentially further reduce latency on desktop platforms. Especially for context-based models, combining our architecture with works such as [17] and [18] enables more parallel execution and higher throughput to be achieved.…”

Section: Discussion (mentioning)

confidence: 99%

“…On the other hand, works like [16] and [19] showed different approaches in implementing high-performance entropy coders. [17] and [18] demonstrated different methods enabling parallel entropy parameter calculations in context models, significantly boosting the serial masked CNN performance bottleneck.…”

Section: Related Work (mentioning)

confidence: 99%

“…Unlike previous works, our method works on most existing learned image codecs without either new neural model or entropy coder architectures. Our approach also comes with the capability to be further optimized when combined with the above model optimizations such as [7], [14], and [6], or high-performance entropy coders such as [16] and [19].…”

Section: Related Work (mentioning)

confidence: 99%

Streaming-Capable High-Performance Architecture of Learned Image Compression Codecs

Lin; Sun; Katto (2022)

2022 IEEE International Conference on Image Processing (ICIP)

Learned image compression achieves state-of-the-art accuracy and compression ratios, but its relatively slow runtime performance limits its usage. While previous attempts at optimizing learned image codecs focused more on the neural model and entropy coding, we present an alternative method for improving the runtime performance of various learned image compression models. We introduce multi-threaded pipelining and an optimized memory model to enable asynchronous execution of GPU and CPU workloads, taking full advantage of computational resources. Our architecture alone already produces excellent performance without any change to the neural model itself. We also demonstrate that combining our architecture with previous tweaks to the neural models can further improve runtime performance. We show that our implementations excel in throughput and latency compared to the baseline and demonstrate the performance of our implementations by creating a real-time video streaming encoder-decoder sample application, with the encoder running on an embedded device.