Efficient Adaptation of Neural Network Filter for Video Compression

Lam, Yat-Hong; Zare, Alireza; Cricri, Francesco; Lainema, Jani; Hannuksela, Miska M.

doi:10.1145/3394171.3413536

Cited by 24 publications

(16 citation statements)

References 11 publications


“…• QP-specific training: dedicating one model for each QP or a range of QPs [8,66]. • QP-map training: providing QP as an input to the network [12,27,33,50,57]. Each approach has benefits and drawbacks.…”

Section: Methods Based On Coding Information (mentioning)

confidence: 99%

A CNN-Based Prediction-Aware Quality Enhancement Framework for VVC

Nasiri

Hamidouche

Morin

et al. 2021

IEEE Open J. Signal Process.


This paper presents a framework for Convolutional Neural Network (CNN)-based quality enhancement task, by taking advantage of coding information in the compressed video signal. The motivation is that normative decisions made by the encoder can significantly impact the type and strength of artifacts in the decoded images. In this paper, the main focus has been put on decisions defining the prediction signal in intra and inter frames. This information has been used in the training phase as well as input to help the process of learning artifacts that are specific to each coding type. Furthermore, to retain a low memory requirement for the proposed method, one model is used for all Quantization Parameters (QPs) with a QP-map, which is also shared between luma and chroma components. In addition to the Post Processing (PP) approach, the In-Loop Filtering (ILF) codec integration has also been considered, where the characteristics of the Group of Pictures (GoP) are taken into account to boost the performance. The proposed CNN-based Quality Enhancement (QE) framework has been implemented on top of the Versatile Video Coding (VVC) Test Model (VTM-10). Experiments show that the prediction-aware aspect of the proposed method improves the coding efficiency gain of the default CNN-based QE method by 1.52%, in terms of BD-BR, at the same network complexity compared to the default CNN-based QE filter.
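The QP-map idea mentioned in the citation statement and abstract above (one model for all QPs, with the QP supplied as a network input) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `add_qp_map`, the (C, H, W) layout, and the normalization by a maximum QP of 63 are assumptions made for illustration only.

```python
import numpy as np

def add_qp_map(frame, qp, max_qp=63):
    """Stack a constant, normalized QP plane onto a (C, H, W) frame.

    Assumption: the QP is scaled to [0, 1] by max_qp and appended as one
    extra input channel; the cited paper may normalize or share it differently.
    """
    if frame.ndim != 3:
        raise ValueError("expected a (C, H, W) array")
    _, height, width = frame.shape
    # One plane filled with the normalized QP, same dtype as the frame.
    qp_plane = np.full((1, height, width), qp / max_qp, dtype=frame.dtype)
    return np.concatenate([frame, qp_plane], axis=0)

# Hypothetical usage: a 3-channel 4x4 decoded block coded at QP 37.
frame = np.zeros((3, 4, 4), dtype=np.float32)
net_input = add_qp_map(frame, qp=37)
```

Because the QP travels with the input, a single set of weights can serve every QP, which is the memory saving the abstract refers to; QP-specific training would instead keep one model per QP or QP range.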


“…This operation is performed on the encoder side, and the subject of optimization may be the encoder itself, the output of the encoder, or the decoder. In [9,10], the post-processing filter is finetuned at decoder side by using weight-updates signaled from the encoder to the decoder at inference time. Techniques of latent tensor overfitting for better human consumption are presented in [11,12], aiming at reducing the distortion in the pixel domain.…”

Section: Related Work (mentioning)

confidence: 99%

Learned Image Coding for Machines: A Content-Adaptive Approach

Zhang

Cricri

et al. 2021

2021 IEEE International Conference on Multimedia and Expo (ICME)

Self Cite


Today, according to the Cisco Annual Internet Report (2018–2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos represents a new challenge and opens up new perspectives in the context of data compression. One possible solution approach consists of adapting current human-targeted image and video coding standards to the use case of machine consumption. Another approach consists of developing completely new compression paradigms and architectures for machine-to-machine communications. In this paper, we focus on image compression and present an inference-time content-adaptive finetuning scheme that optimizes the latent representation of an end-to-end learned image codec, aimed at improving the compression efficiency for machine-consumption. The conducted experiments targeting instance segmentation task network show that our online finetuning brings an average bitrate saving (BD-rate) of -3.66% with respect to our pretrained image codec. In particular, at low bitrate points, our proposed method results in a significant bitrate saving of -9.85%. Overall, our pretrained-and-then-finetuned system achieves -30.54% BD-rate over the state-of-the-art image/video codec Versatile Video Coding (VVC) on instance segmentation.


“…However, the single-path model is hard to utilize spatially-precise representations and large receptive field simultaneous. With respect to the inference of neural network for video filtering, frame-level on/off control was investigated in [6] and an efficient finetuning methodology was proposed to adapt the neural network to the specific content [14]. Attention mechanism.…”

Section: Related Work (mentioning)

confidence: 99%

Multi-Density Attention Network for Loop Filtering in Video Compression

Wang

Ma

2021

Preprint


Video compression is a basic requirement for consumer and professional video applications alike. Video coding standards such as H.264/AVC and H.265/HEVC are widely deployed in the market to enable efficient use of bandwidth and storage for many video applications. To reduce the coding artifacts and improve the compression efficiency, neural network based loop filtering of the reconstructed video has been developed in the literature. However, loop filtering is a challenging task due to the variation in video content and sampling densities. In this paper, we propose an on-line scaling based multi-density attention network for loop filtering in video compression. The core of our approach lies in several aspects: (a) parallel multi-resolution convolution streams for extracting multi-density features, (b) single attention branch to learn the sample correlations and generate mask maps, (c) a channel-mutual attention procedure to fuse the data from multiple branches, (d) on-line scaling technique to further optimize the output results of network according to the actual signal. The proposed multi-density attention network learns rich features from multiple sampling densities and performs robustly on video content of different resolutions. Moreover, the online scaling process enhances the signal adaptability of the off-line pre-trained model. Experimental results show that 10.18% bit-rate reduction at the same video quality can be achieved over the latest Versatile Video Coding (VVC) standard. The objective performance of the proposed algorithm outperforms the state-of-the-art methods and the subjective quality improvement is obvious in terms of detail preservation and artifact alleviation. CCS Concepts: Computing methodologies → Image compression.
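The abstract above does not spell out how its on-line scaling works. One common way to adapt a pre-trained filter's output to the actual signal is to scale the filter's residual by a factor the encoder derives from the original frame; the sketch below illustrates that idea only. The least-squares formulation and the names `best_residual_scale` and `apply_scale` are assumptions, not the cited paper's method.

```python
import numpy as np

def best_residual_scale(original, decoded, filtered):
    """Encoder side: least-squares scale for the filter residual.

    Assumption: the scale s minimizes ||original - (decoded + s * r)||^2,
    where r = filtered - decoded is what the network changed.
    """
    decoded = decoded.astype(np.float64)
    residual = filtered.astype(np.float64) - decoded
    target = original.astype(np.float64) - decoded
    energy = float(np.sum(residual * residual))
    if energy == 0.0:
        return 1.0  # the filter changed nothing, so any scale is equivalent
    return float(np.sum(residual * target) / energy)

def apply_scale(decoded, filtered, scale):
    """Decoder side: blend the filter output using the signaled scale."""
    return decoded + scale * (filtered - decoded)

# Hypothetical toy signal: the filter overshoots the original by a factor of 2.
decoded = np.zeros((2, 2))
original = np.full((2, 2), 2.0)
filtered = np.full((2, 2), 4.0)
scale = best_residual_scale(original, decoded, filtered)
output = apply_scale(decoded, filtered, scale)
```

In such a scheme only the scale (typically quantized, per frame or per block) needs to be transmitted, which is far cheaper than signaling updated network weights.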
