Ultra High Definition TV (UHDTV) services are being trialled, while UHD streaming services have already made their commercial débuts. The amount of data associated with these new services is very high, and thus extremely efficient video compression tools are required for delivery to the end user. The recently published High Efficiency Video Coding (HEVC) standard promises a new level of compression efficiency, up to 50% better than its predecessor, Advanced Video Coding (AVC). The greater efficiency of HEVC, however, comes at a much higher computational cost than AVC. A practical encoder must optimise the choice of coding tools and devise strategies to reduce complexity without affecting compression efficiency. This paper describes the results of a study aimed at optimising HEVC encoding for UHDTV content. The study first reviews the available HEVC coding tools to identify the best configuration, before developing three new algorithms to further reduce the computational cost. The proposed optimisations provide an additional 11.5% encoder speed-up for an average 3.1% bitrate increase on top of the best encoder configuration.
Inter-prediction based on block-based motion estimation (ME) is used in most video codecs. The closer the prediction to the target block, the lower the residual, and thus the more efficient the compression that can be achieved. In this paper, a new technique called enhanced inter-prediction (EIP) is proposed to improve the prediction candidates by means of an additional transformation applied during ME. A parametric transformation acts within the coding loop of each block to modify the prediction for each motion vector candidate. The EIP is validated in the particular case of a single-parameter shifting transformation. This paper presents an efficient algorithm to compute the best shift for each prediction candidate, and a model to select the optimal prediction based on minimum cost, integrating the approach with existing rate-distortion optimization techniques in the H.264/AVC video codec. Results show significant improvements, with an average 6% bit-rate reduction compared to the original H.264/AVC. Index Terms: H.264/AVC, inter-prediction, video coding.
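The single-parameter shifting transformation described above can be illustrated with a small sketch. For an additive shift and a sum-of-squared-differences criterion, the best shift for a candidate prediction is simply the rounded mean residual; the function names and cost measure below are illustrative, as the paper's actual selection model is rate-distortion based.

```python
import numpy as np

def best_shift(target, prediction):
    """Best additive shift s minimising the SSD between the target block
    and (prediction + s): the rounded mean of the residual.
    Illustrative sketch; the paper integrates this with RD optimization."""
    residual = target.astype(np.int32) - prediction.astype(np.int32)
    return int(round(float(np.mean(residual))))

def shifted_prediction_cost(target, prediction, bitdepth_max=255):
    """Apply the best shift to the prediction candidate and return
    (shift, SSD cost) for comparison against other candidates."""
    s = best_shift(target, prediction)
    shifted = np.clip(prediction.astype(np.int32) + s, 0, bitdepth_max)
    ssd = int(np.sum((target.astype(np.int32) - shifted) ** 2))
    return s, ssd
```

A candidate whose only mismatch with the target is a uniform brightness offset is, under this transformation, corrected at the cost of signalling a single parameter.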
Deep learning has shown great potential in image and video compression tasks. However, it brings bit savings at the cost of significant increases in coding complexity, which limits its potential for implementation in practical applications. In this paper, a novel neural network-based tool is presented which improves the interpolation of reference samples needed for fractional-precision motion compensation. Contrary to previous efforts, the proposed approach focuses on complexity reduction, achieved by interpreting the interpolation filters learned by the networks. When the approach is implemented in the Versatile Video Coding (VVC) test model, up to 4.5% BD-rate saving for individual sequences is achieved compared with the baseline VVC, while the complexity of the learned interpolation is significantly reduced compared to applying the full neural network.
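The key idea, interpreting a learned interpolation as a fixed filter, can be sketched as follows. Conventional codecs generate fractional-pel reference samples by FIR filtering; the 8-tap half-sample luma filter below is the one used in HEVC/VVC, and a network-derived filter would simply replace its tap values, so that inference reduces to an ordinary convolution. The function name and 1-D simplification are illustrative.

```python
import numpy as np

# 8-tap half-sample luma interpolation filter from HEVC/VVC
# (coefficients sum to 64, so results are normalised by >> 6).
HALF_PEL_TAPS = np.array([-1, 4, -11, 40, 40, -11, 4, -1], dtype=np.int32)

def interpolate_half_pel(row, taps=HALF_PEL_TAPS):
    """Interpolate half-sample positions of a 1-D row of integer samples
    by FIR filtering. Swapping `taps` for coefficients extracted from a
    trained network applies the 'interpreted' learned filter at the same
    low complexity as the standard filter."""
    row = np.asarray(row, dtype=np.int32)
    padded = np.pad(row, (3, 4), mode='edge')  # extend borders for 8-tap support
    out = np.empty(len(row), dtype=np.int32)
    for i in range(len(row)):
        out[i] = (padded[i:i + 8] * taps).sum()
    return (out + 32) >> 6  # round and normalise by the filter gain of 64
```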
The flexible partitioning scheme and increased number of prediction modes in the High Efficiency Video Coding (HEVC) standard are largely responsible for both its high compression efficiency and its computational complexity. In typical HEVC encoder implementations, Coding Units (CUs) in a Coding Tree Unit (CTU) are visited from top to bottom at each level of recursion to select the optimal coding configuration. In this paper, a novel approach is presented in which CUs in a CTU can also be adaptively visited in a reverse, bottom-to-top order. This Reverse CU (RCU) visiting order allows for different algorithmic optimizations that further reduce the complexity of many HEVC encoding steps, especially under challenging conditions such as highly textured or fast-moving content. In particular, algorithms to reduce the complexity of HEVC depth selection, mode decision and inter-prediction are presented here, based on the coding information obtained from higher depths when using the RCU visiting order. Experimental results show that enabling different stages of the proposed algorithm achieves average speed-ups from 16.3% to 36.6% compared to a fast reference HEVC implementation with built-in speed-ups enabled (up to 51.2% in some cases), for a 0.3% to 2.2% BD-rate penalty.
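The difference between the conventional and reverse visiting orders can be sketched as a recursive search. In the sketch below, when a heuristic predicts the block is likely to be split (e.g. high texture or motion), the sub-CUs at the next depth are evaluated first, and their results are then available when (or whether) the larger CU is evaluated. The function names and the skip heuristic are illustrative, not the paper's exact decision rules.

```python
import numpy as np

def split4(block):
    # Split a square block (2-D array) into its four quadrants.
    h, w = block.shape[0] // 2, block.shape[1] // 2
    return [block[:h, :w], block[:h, w:], block[h:, :w], block[h:, w:]]

def rcu_search(block, depth, max_depth, evaluate_cu, likely_split):
    """Sketch of adaptive CU visiting: conventional top-to-bottom order
    evaluates the current CU before its sub-CUs; the reverse (RCU) order
    evaluates sub-CUs first, so coding information from higher depths
    can inform, or skip, the evaluation of the larger CU."""
    if depth == max_depth:
        return evaluate_cu(block, depth)
    if likely_split(block, depth):
        # Reverse order: children first ...
        sub_cost = sum(rcu_search(sb, depth + 1, max_depth,
                                  evaluate_cu, likely_split)
                       for sb in split4(block))
        # ... then the parent, which could be skipped entirely when the
        # children already indicate a confident split decision.
        return min(sub_cost, evaluate_cu(block, depth))
    # Conventional top-to-bottom order: parent first, then children.
    cur_cost = evaluate_cu(block, depth)
    sub_cost = sum(rcu_search(sb, depth + 1, max_depth,
                              evaluate_cu, likely_split)
                   for sb in split4(block))
    return min(cur_cost, sub_cost)
```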
Neural networks can be successfully used to improve several modules of advanced video coding schemes. In particular, compression of colour components was shown to benefit greatly from the use of machine learning models, thanks to the design of appropriate attention-based architectures that allow the prediction to exploit specific samples in the reference region. However, such architectures tend to be complex and computationally intense, and may be difficult to deploy in a practical video coding pipeline. This work focuses on reducing the complexity of such methodologies, in order to design a set of simplified and cost-effective attention-based architectures for chroma intra-prediction. A novel size-agnostic multi-model approach is proposed to reduce the complexity of the inference process. The resulting simplified architecture is still capable of outperforming state-of-the-art methods. Moreover, a collection of simplifications is presented in this paper to further reduce the complexity overhead of the proposed prediction architecture. Thanks to these simplifications, a reduction in the number of parameters of around 90% is achieved with respect to the original attention-based methodologies. The simplifications include a framework for reducing the overhead of the convolutional operations, a simplified cross-component processing model integrated into the original architecture, and a methodology for performing integer-precision approximations with the aim of obtaining fast and hardware-aware implementations. The proposed schemes are integrated into the Versatile Video Coding (VVC) prediction pipeline, retaining the compression efficiency of state-of-the-art chroma intra-prediction methods based on neural networks, while offering different directions for significantly reducing coding complexity.
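The attention mechanism referred to above, letting the prediction exploit specific samples in the reference region, can be sketched in a few lines. Each position of the chroma block to be predicted attends over the reconstructed reference samples, and the prediction is the attention-weighted sum of their chroma values. The feature inputs are assumed precomputed here; in the full architecture they would come from small convolutional branches, and the function names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_chroma_prediction(block_feats, ref_feats, ref_chroma):
    """Minimal sketch of attention-based chroma intra-prediction.
    block_feats: (N, d) features for the N block positions to predict
    ref_feats:   (M, d) features for the M reference-region samples
    ref_chroma:  (M,)   reconstructed chroma values of those samples
    Returns the (N,) predicted chroma values as attention-weighted
    sums over the reference samples."""
    scores = block_feats @ ref_feats.T / np.sqrt(block_feats.shape[1])
    weights = softmax(scores, axis=1)  # (N, M), each row sums to 1
    return weights @ ref_chroma
```

Because each row of attention weights sums to one, the prediction is always a convex combination of reference chroma values, which is what makes the mechanism interpretable and amenable to the simplifications the paper describes.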