This paper presents a method that allows learned video encoders to apply arbitrary latent refinement strategies as a form of Rate-Distortion Optimization (RDO) at encoding time. To do so, a search is performed in the latent domain, starting from an initial latent representation of the video signal. This search is implemented as a set of iterations, each of which performs a gradient-descent step that back-propagates the error defined by a Lagrangian RD cost. This cost function is intentionally chosen to be the same as the one used during end-to-end model training, except that instead of updating model weights, each iteration fine-tunes the latent representation itself. Moreover, a temporal look-ahead is integrated into the cost function of I and P frames to account for the cascade effect of their latent fine-tuning on subsequent frames in the Group of Pictures (GOP). Experiments show that the proposed latent-space RDO method improves BD-BR coding efficiency by 11.6% and 9.4% in Random-Access (RA) and All-Intra (AI) configurations, respectively, when applied on top of a high-performance open-source end-to-end codec.
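The core idea above, iteratively descending a Lagrangian RD cost with respect to the latents rather than the model weights, can be illustrated with a deliberately simplified sketch. Here a fixed linear map `W` stands in for the learned decoder and the L1 norm of the latent is a crude differentiable rate proxy; the actual method back-propagates through the full end-to-end codec and its entropy model, so all names and choices below are illustrative assumptions.

```python
import numpy as np

# Toy sketch of latent-space RDO by gradient descent (assumed setup:
# linear "decoder" W, L1 rate proxy; not the paper's actual codec).
rng = np.random.default_rng(0)
x = rng.normal(size=4)                      # source signal block
W = rng.normal(size=(4, 4))                 # toy stand-in for the decoder
y = rng.normal(size=4)                      # initial latent representation
lam = 0.1                                   # Lagrange multiplier (RD trade-off)
lr = 0.05                                   # gradient-descent step size

def rd_cost(y):
    """Lagrangian cost J = R + lambda * D, same form as the training loss."""
    dist = float(np.sum((W @ y - x) ** 2))  # distortion D (squared error)
    rate = float(np.sum(np.abs(y)))         # rate proxy R (L1 of latents)
    return rate + lam * dist

cost_before = rd_cost(y)
for _ in range(200):                        # iterative latent refinement
    # Analytic gradient of J w.r.t. the latents (not the weights):
    grad = np.sign(y) + lam * 2.0 * (W.T @ (W @ y - x))
    y = y - lr * grad
cost_after = rd_cost(y)
```

In the paper's setting, the same loop would be run per frame, with the look-ahead term for I and P frames adding the (discounted) RD cost of dependent frames into `rd_cost` before back-propagation.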