Intra Frame Prediction for Video Coding Using a Conditional Autoencoder Approach

Brand, Fabian; Seiler, Jürgen; Kaup, André

doi:10.1109/pcs48520.2019.8954546

Cited by 15 publications

(9 citation statements)

References 13 publications


“…Using a spatially correlated latent space and the cross-component CAE chroma extensions, we save more than 0.5% rate in the luma and about 1% in the chroma components compared to current technology. In this paper we proved, that the conditional autoencoder for intra prediction, which we proposed in [14] is able to form a fully functional intra prediction system for all components, outperforming state-of-the-art methods in terms of Bjøntegaard delta rate savings.…”

Section: Discussion (mentioning)

confidence: 86%

“…In [14], we furthermore have shown that the prediction quality can be increased by performing a second training, which refines only the decoder network and accustoms it to vector-quantized inputs. That way, we can compensate the lower variance of the simulated quantization noise.…”

Section: A General Concept (mentioning)

confidence: 95%

“…Possible solutions for this problem are splitting the training set according to some criterion as in [11], or innovative training methods as in [5], training multiple networks jointly. In [14], we have proposed the conditional autoencoder for intra prediction, a novel method to generate an arbitrary number of modes using just one network. With this approach, instead of a prediction mode, we transmit a latent space representation which, in a very abstract way, contains instructions how to predict the block from its neighborhood.…”

Section: Related Work (mentioning)

confidence: 99%


Intra-Frame Coding Using a Conditional Autoencoder

Brand

Seiler

Kaup

2021

IEEE J. Sel. Top. Signal Process.

Self Cite


Exploiting spatial redundancy in images is responsible for a large gain in the performance of image and video compression. The main tool to achieve this is called intra-frame prediction. In most state-of-the-art video coders, intra prediction is applied in a block-wise fashion. Up to now angular prediction was dominant, providing a low-complexity method covering a large variety of content. With deep learning, however, it is possible to create prediction methods covering a wider range of content, being able to predict structures which traditional modes cannot predict accurately. Using the conditional autoencoder structure, we are able to train a single artificial neural network which is able to perform multi-mode prediction. In this paper, we derive the approach from the general formulation of the intra-prediction problem and introduce two extensions for spatial mode prediction and for chroma prediction support. Moreover, we propose a novel latent-space-based cross-component prediction. We show the power of our prediction scheme with visual examples and report average gains of 1.13% in Bjøntegaard delta rate in the luma component and 1.21% in the chroma component compared to VTM using only traditional modes.
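To make the data flow of a conditional autoencoder for intra prediction more concrete, the following is a minimal structural sketch, not the authors' network. It assumes a 4x4 block, nine neighboring reference samples, a two-dimensional latent, untrained random linear layers, and plain rounding as a crude stand-in for the quantization described in the statements above. All names (`ToyConditionalAutoencoder`, `encode`, `decode`) are hypothetical.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random weight matrix; a stand-in for trained parameters (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, vec):
    """Plain matrix-vector product, one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


class ToyConditionalAutoencoder:
    """Illustrative only: both halves are conditioned on the neighborhood."""

    def __init__(self, block_size=16, context_size=9, latent_size=2, seed=0):
        rng = random.Random(seed)
        self.enc = make_layer(block_size + context_size, latent_size, rng)
        self.dec = make_layer(latent_size + context_size, block_size, rng)

    def encode(self, block, context):
        """Encoder side: sees the original block and its neighborhood."""
        latent = apply_layer(self.enc, block + context)
        # Rounding is a simple scalar stand-in for the quantization step.
        return [round(z) for z in latent]

    def decode(self, latent, context):
        """Decoder side: needs only the transmitted latent and the neighborhood."""
        return apply_layer(self.dec, latent + context)


cae = ToyConditionalAutoencoder()
block = [float(i) for i in range(16)]    # flattened 4x4 block to be predicted
context = [float(i) for i in range(9)]   # 4 top + 4 left + 1 corner sample
latent = cae.encode(block, context)      # this is what would be transmitted
prediction = cae.decode(latent, context)
```

In this sketch, each distinct latent value yields a different prediction from the same neighborhood, which is the sense in which a single network can stand in for many prediction modes.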


“…VVC included the planar and DC modes already available in HEVC as the non-directional prediction options. The other 65 modes are directional, using different angles to predict the current block [49]. Additionally, there are other innovations, such as: wide-angle intra prediction (WAIP), used to apply directional intra modes to non-square blocks; multiple reference line prediction (MRL), allowing the use of more reference lines; intra sub-partitions (ISP), applied to explore correlations among intra-block samples; and matrix-weighted intra prediction (MIP), which performs the prediction through matrix multiplications and sample interpolation.…”

Section: Intra-frame Prediction (mentioning)

confidence: 99%
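The two non-directional modes named in the statement above can be sketched in a few lines. This is a simplified illustration for square, power-of-two blocks in the HEVC style; it omits reference sample filtering, DC edge smoothing, and the non-square handling used in VVC. The function names and argument layout are illustrative assumptions, not taken from any codec's source code.

```python
def dc_predict(top, left):
    """Simplified DC mode: fill the block with the rounded mean of the references."""
    n = len(top)
    assert len(left) == n
    dc = (sum(top) + sum(left) + n) // (2 * n)  # +n rounds to nearest
    return [[dc] * n for _ in range(n)]


def planar_predict(top, left, top_right, bottom_left):
    """Simplified planar mode: average of a horizontal and a vertical linear blend."""
    n = len(top)
    assert len(left) == n
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Blend between the left neighbor of this row and the top-right sample.
            horizontal = (n - 1 - x) * left[y] + (x + 1) * top_right
            # Blend between the top neighbor of this column and the bottom-left sample.
            vertical = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y][x] = (horizontal + vertical + n) >> shift
    return pred


# Usage with a 4x4 block: a brighter top row and a flat left column.
dc_block = dc_predict([10, 20, 30, 40], [10, 10, 10, 10])
# dc_block[0][0] is 18, the rounded mean of the eight reference samples.
planar_block = planar_predict([0, 0, 0, 0], [0, 0, 0, 0], top_right=80, bottom_left=0)
```

Both modes use only already-reconstructed neighboring samples, which is why the decoder can repeat the prediction without extra side information beyond the mode choice.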

Modern Video Coding: Methods, Challenges and Systems

Palau

Silveira

Domanski

et al. 2021

JICS


With the increasing demand for digital video applications in our daily lives, video coding and decoding become critical tasks that must be supported by several types of devices and systems. This paper presents a discussion of the main challenges to design dedicated hardware architectures based on modern hybrid video coding formats, such as the High Efficiency Video Coding (HEVC), the AOMedia Video 1 (AV1) and the Versatile Video Coding (VVC). The paper discusses each step of the hybrid video coding process, highlighting the main challenges for each codec and discussing the main hardware solutions published in the literature. The discussions presented in the paper show that there are still many challenges to be overcome and open research opportunities, especially for the AV1 and VVC codecs. Most of these challenges are related to the high throughput required for processing high and ultrahigh resolution videos in real time and to energy constraints of multimedia-capable devices.


“…Using neural networks, conditional coding can be implemented as a conditional autoencoder. The first use of a conditional autoencoder in the context of image and video compression was proposed in [5] for intra prediction.…”

Section: Introduction (mentioning)

confidence: 99%

Generalized Difference Coder: A Novel Conditional Autoencoder Structure for Video Compression

Brand

Seiler

Schober

2021

Preprint

Self Cite


Motion compensated inter prediction is a common component of all video coders. The concept was established in traditional hybrid coding and successfully transferred to learning-based video compression. To compress the residual signal after prediction, usually the difference of the two signals is compressed using a standard autoencoder. However, information theory tells us that a general conditional coder is more efficient. In this paper, we provide a solid foundation based on information theory and Shannon entropy to show the potentials but also the limits of conditional coding. Building on those results, we then propose the generalized difference coder, a special case of a conditional coder designed to avoid limiting bottlenecks. With this coder, we are able to achieve average rate savings of 27.8% compared to a standard autoencoder, by only adding a moderate complexity overhead of less than 7%.
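The statement that a general conditional coder is at least as efficient as difference coding follows from two standard properties of Shannon entropy. As a brief sketch, with x denoting the signal to be coded and x_p the prediction available at both encoder and decoder (notation assumed here, not taken from the abstract):

```latex
\begin{align}
H(x \mid x_p) &= H(x - x_p \mid x_p) && \text{(subtracting the known } x_p \text{ is invertible)} \\
              &\le H(x - x_p)        && \text{(conditioning cannot increase entropy)}
\end{align}
```

Equality holds when the residual x - x_p is statistically independent of x_p; otherwise a coder that conditions on x_p can, in principle, spend fewer bits than one that codes the difference alone.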
