Spatial resolution adaptation is a technique often employed in video compression to enhance coding efficiency: a lower-resolution version of the input video is encoded, and the original resolution is reconstructed at the decoder. Instead of conventional up-sampling filters, recent work has employed advanced super-resolution methods based on convolutional neural networks (CNNs) to further improve reconstruction quality. These approaches are usually trained to minimise pixel-based losses such as Mean Squared Error (MSE), even though such metrics correlate poorly with subjective opinion. In this paper, a perceptually-inspired super-resolution approach (M-SRGAN) is proposed for spatial up-sampling of compressed video, using a modified CNN model trained with a generative adversarial network (GAN) on compressed content with perceptual loss functions. The proposed method was integrated with HEVC HM 16.20 and evaluated on the UHD test sequences of the JVET Common Test Conditions using the Random Access configuration. The results show clear perceptual quality improvement over the original HM 16.20, with an average bitrate saving of 35.6% (Bjøntegaard Delta measurement) based on the perceptual quality metric VMAF.
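Bitrate savings such as the 35.6% figure above are Bjøntegaard Delta (BD) rate measurements, which compare two rate-quality curves by averaging the log-rate difference over a shared quality range. A minimal sketch is given below; note that the standard BD metric fits a cubic polynomial to log-rate over quality, whereas this simplified variant uses piecewise-linear interpolation, and all function names here are illustrative rather than from the paper's tooling.

```python
import math

def bd_rate(anchor, test, n=100):
    """Simplified sketch of the Bjontegaard Delta rate metric.

    `anchor` and `test` are lists of (bitrate, quality) points, with
    quality measured e.g. in PSNR or VMAF. The standard BD metric fits a
    cubic polynomial to log-rate over quality; this sketch interpolates
    piecewise-linearly instead, which is close for dense curves.
    Returns the average bitrate change of `test` relative to `anchor`
    in percent (negative values are bitrate savings).
    """
    def log_rate_at(curve, q):
        pts = sorted(curve, key=lambda p: p[1])
        for (r0, q0), (r1, q1) in zip(pts, pts[1:]):
            if q0 <= q <= q1:
                t = (q - q0) / (q1 - q0)
                return math.log(r0) + t * (math.log(r1) - math.log(r0))
        raise ValueError("quality value outside curve range")

    # integrate over the quality interval covered by both curves
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    diffs = [log_rate_at(test, lo + (hi - lo) * i / n)
             - log_rate_at(anchor, lo + (hi - lo) * i / n)
             for i in range(n + 1)]
    return (math.exp(sum(diffs) / len(diffs)) - 1.0) * 100.0
```

For example, a codec that reaches the same quality at half the anchor's bitrate at every point yields a BD-rate of -50%.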
This paper presents a novel convolutional neural network (CNN) based effective bit depth adaptation approach (EBDA-CNN) for video compression. It applies effective bit depth down-sampling before encoding and reconstructs the original bit depth using a deep CNN-based up-sampling method at the decoder. The proposed approach has been integrated with the High Efficiency Video Coding reference software HM 16.20 and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. The results show consistent coding gains on all tested sequences, with an average bitrate saving of 6.4% based on Bjøntegaard Delta measurements using PSNR.
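The bit depth adaptation idea can be illustrated with a minimal sketch. The helper names below are hypothetical, and the decoder-side reconstruction in the paper is a trained deep CNN; the plain shift with rounding offset shown here is only the naive baseline it replaces.

```python
def reduce_bit_depth(samples, shift=2):
    """Effective bit depth down-sampling: right-shift each sample,
    e.g. 10-bit values to 8-bit with shift=2 (naive sketch)."""
    return [s >> shift for s in samples]

def naive_restore(samples, shift=2):
    """Naive up-sampling baseline: left-shift with a rounding offset.
    The paper replaces this step with CNN-based reconstruction."""
    return [(s << shift) + (1 << (shift - 1)) for s in samples]
```

The round trip is lossy (the low bits are discarded), which is why a learned reconstruction can recover quality that the naive shift cannot.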
Encoding spatio-temporally varying textures is challenging for standardised video encoders, with significantly more bits required for textured blocks than for non-textured blocks. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimize coding modes and performance. To this end, we examine the classification of video textures based on encoder performance. We employ spatio-temporal features and follow a two-step feature selection process, applying unsupervised machine learning approaches across the selected feature space. Finally, supervised machine learning approaches are applied to the selected features, enabling classification prior to encoding with up to 95.1% accuracy. The results of this study will form the basis of a new informed approach to codec configuration and mode selection in both current and future encoders.
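Unsupervised feature filtering of the kind described above can take many forms; as one illustrative (not the paper's exact) two-step procedure, near-constant features can be dropped first and then one of each highly correlated pair, before a supervised classifier is trained on what remains. The thresholds and helper name below are assumptions.

```python
import statistics

def filter_features(samples, var_thresh=1e-3, corr_thresh=0.95):
    """Illustrative two-step unsupervised feature filter: drop
    near-constant features, then drop one of each highly correlated
    pair. `samples` is a list of equal-length feature vectors;
    returns the indices of retained features."""
    n_feat = len(samples[0])
    cols = [[s[i] for s in samples] for i in range(n_feat)]
    # step 1: variance threshold
    keep = [i for i in range(n_feat)
            if statistics.pvariance(cols[i]) > var_thresh]

    def corr(a, b):
        ma, mb = statistics.fmean(a), statistics.fmean(b)
        num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        den = (sum((x - ma) ** 2 for x in a)
               * sum((y - mb) ** 2 for y in b)) ** 0.5
        return num / den if den else 0.0

    # step 2: greedy redundancy removal by pairwise correlation
    selected = []
    for i in keep:
        if all(abs(corr(cols[i], cols[j])) < corr_thresh for j in selected):
            selected.append(i)
    return selected
```

A classifier trained on the surviving feature subset then predicts the texture class (and hence expected encoding behaviour) before any encoding takes place.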
In this paper we compare the performance of two state-of-the-art competing codecs, AV1 and HEVC, in the context of adaptive streaming. We specifically consider a Dynamic Optimizer (DO) methodology that is content-aware and selects the resolution of the video sequence after constructing the convex hull of the Rate-Quality curves of all considered resolutions. We start with an objective evaluation of the Dynamic Optimizer based on both PSNR and VMAF quality metrics. The Rate-VMAF curves show an average 6.3% BD-Rate gain for AV1 over HEVC, while the Rate-PSNR curves show an average BD-Rate loss of 1.8%. We then report subjective tests that evaluate the perceived quality of the selected bitstreams generated by the two codecs. In this case it was found that, for most rate points, the difference in perceived quality between HEVC and AV1 is not significant.
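The convex-hull construction underlying the DO methodology can be sketched as follows. This is a minimal stand-in, not the actual optimizer: it assumes all encoded (bitrate, quality, resolution) points are pooled, Pareto-filters them, and keeps the upper convex hull, i.e. the operating points at which switching resolution is worthwhile.

```python
def rq_convex_hull(points):
    """Sketch of the convex-hull step of a Dynamic Optimizer-style
    selection (hypothetical helper). `points` is a list of
    (bitrate, quality, resolution) tuples pooled across all encoded
    resolutions; returns the points on the upper convex hull."""
    pts = sorted(points)
    # Pareto filter: with rates ascending, keep only strict quality gains
    pareto = []
    for p in pts:
        if not pareto or p[1] > pareto[-1][1]:
            pareto.append(p)
    # upper hull: pop points lying on or below the chord to the new point
    hull = []
    for r, q, res in pareto:
        while len(hull) >= 2:
            r1, q1, _ = hull[-2]
            r2, q2, _ = hull[-1]
            if (q2 - q1) * (r - r1) <= (q - q1) * (r2 - r1):
                hull.pop()
            else:
                break
        hull.append((r, q, res))
    return hull
```

On the resulting hull, lower rate points typically come from lower resolutions and higher rate points from higher resolutions, which is exactly the per-rate resolution switching the DO exploits.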