Learned Image Compression With Discretized Gaussian Mixture Likelihoods and Attention Modules

Cheng, Zhengxue; Sun, Heming; Takeuchi, Masaru; Katto, Jiro

doi:10.1109/cvpr42600.2020.00796

Cited by 660 publications

(658 citation statements)

References 13 publications

Supporting

Mentioning

657

Contrasting


“…Previous methods accumulated all the priors to estimate the probability based on a single GMM assumption for each element. Recent studies in [151] and [152] have shown that weighted GMMs can further improve coding efficiency.…”

Section: R-D Optimization (mentioning)

confidence: 99%

Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies

Ding

Chen

et al. 2021

Proc. IEEE


Significant advances in video compression systems have been made in the past several decades to satisfy the near-exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks, including preprocessing, coding, and postprocessing, which have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI)-powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review recent technical advances in video compression systems extensively, with an emphasis on deep neural network (DNN)-based approaches, and then present three comprehensive case studies. On preprocessing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement…
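The citation statement above contrasts a single-Gaussian assumption per element with weighted Gaussian mixtures. As a rough illustration only, the following sketch computes the probability mass that a weighted mixture assigns to an integer-quantized symbol by integrating each component over a unit-width bin, and the resulting bit estimate. The function names, the unit bin width, and the example parameters are assumptions made for this sketch; they are not taken from any of the papers listed here.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def discretized_gmm_likelihood(y_hat, weights, means, scales):
    """Probability mass of the quantized symbol y_hat under a weighted
    Gaussian mixture, integrating each component over the bin
    [y_hat - 0.5, y_hat + 0.5] (hypothetical helper for illustration)."""
    total = 0.0
    for w, mu, sigma in zip(weights, means, scales):
        upper = std_normal_cdf((y_hat + 0.5 - mu) / sigma)
        lower = std_normal_cdf((y_hat - 0.5 - mu) / sigma)
        total += w * (upper - lower)
    return total


def estimated_bits(y_hat, weights, means, scales, eps=1e-9):
    """Estimated code length in bits; eps guards against log2(0)."""
    p = discretized_gmm_likelihood(y_hat, weights, means, scales)
    return -math.log2(max(p, eps))


# Example parameters (illustrative only): two components, weights sum to 1.
weights = [0.6, 0.4]
means = [0.0, 3.0]
scales = [1.0, 2.0]
bits_for_zero = estimated_bits(0, weights, means, scales)
```

With a single component (one weight equal to 1), the same function reduces to the single-Gaussian case that the statement describes as the earlier approach.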



“…This section introduces two classes of solutions to support YUV 4:2:0 format. The first class of solutions are based on input-output channel alignment, which aim to support YUV 4:2:0 without introducing any major changes to the existing network architectures in the literature [9], [10], [11], [12], [13], [14]. On the other hand, second class of solutions proposes a new transform network architecture where the main goal is to compress YUV 4:2:0 input data more efficiently.…”

Section: Transform Network Architectures for Image/Video Coding (mentioning)

confidence: 99%

“…However, in the literature, there is very little or no work on DLEC designs specialized for YUV sources. Although existing architectures designed for coding RGB data [9], [10], [11], [12], [13], [14] (such as the one shown in Fig. 4) can be employed to support non-subsampled YUV 4:4:4 format by simply retraining network parameters on a YUV 4:4:4 dataset, effective solutions for chroma subsampled formats, such as YUV 4:2:0, are non-trivial and require new neural network architectures.…”

Section: Introduction (mentioning)

confidence: 99%

Transform Network Architectures for Deep Learning Based End-to-End Image/Video Coding in Subsampled Color Spaces

Egilmez

Singh

Coban

et al. 2021

IEEE Open J. Signal Process.


Most of the existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for non-subsampled RGB color format. However, in order to achieve a superior coding performance, many state-of-the-art block-based compression standards such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266) are designed primarily for YUV 4:2:0 format, where U and V components are subsampled by considering the human visual system. This paper investigates various DLEC designs to support YUV 4:2:0 format by comparing their performance against the main profiles of HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. The experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for RGB format and achieves about 10% average BD-rate improvement over the intra-frame coding in HEVC.


“…al [31] customized an architecture based on neural network and wavelet transform capable to support both lossy and lossless compression schemes. Cheng et.al developed a flexible entropy model based on discretized Gaussian mixture likelihoods by taking the advantage of recent attention modules and is proved its efficiency in reducing the latency [32].…”

Section: Related Work (mentioning)

confidence: 99%

“…This shows that the suggested method can be a good replacement for the traditional algorithms in an application that requires a large amount of storage space such as Picture Archiving and Communication System(PACS). Part d, e,f of the Figure 6 shows the comparison between the Machine Learning based Compression algorithms -GMM & Attention [32], iWave++ [31], Non-Local 3D-Context [30]. The proposed method is compared to the existing machine learning method on the basis of PSNR, SSIM and Space Saving.…”

Section: Evaluation of System Performance (mentioning)

confidence: 99%

Fast Fractal Coding of MRI Images using Deep Reinforcement Learning

Varghese

Krishnakumar

2021

IJACSA


This paper presents an algorithm based on Fractal theory by using Iterated Function Systems (IFS). An efficient and fast coding mechanism is proposed by exploiting the self-similar nature of Brain MRI images. The proposed algorithm utilizes a Deep Reinforcement Learning (DRL) technique to learn the transformations required to recreate the original image. We use the Adaptive Iterated Function System (AIFS) as the encoding scheme. The proposed algorithm is trained and customised to compress medical images, especially Magnetic Resonance Imaging (MRI). The algorithm is tested and evaluated by using the original MR head scan test images. It learns from an existing biomedical dataset, viz. the Internet Brain Segmentation Repository (IBSR), to predict the new local affine transformations. The empirical analysis shows that the proposed algorithm is at least 4 times faster than the competitive methods and the decoding quality is far distinct with a reduction in the bit rate.

