ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9054074

Feedback Recurrent Autoencoder

Abstract: In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the redundancy along the time dimension and allows a compact discrete representation of the data to be learned. We demonstrate its effectiveness in speech spectrogram compression. Specifically, we show that the FRAE, paired with a powerful neural vocoder, can produce…

Cited by 14 publications (11 citation statements). References 9 publications.

Citation statements (ordered by relevance):
“…As for a general approach to source coding based on DVAEs with a sequence of latent variables, we only found the recent paper (Yang et al., 2020). The authors of this paper propose different schemes for encoding a data sequence x_{1:T} through the inference and quantization of the corresponding sequence of latent vectors z_{1:T}, with different options for recurrent connections.…”
Section: Perspectives In Source Coding (mentioning)
confidence: 99%
“…The authors of this paper propose different schemes for encoding a data sequence x_{1:T} through the inference and quantization of the corresponding sequence of latent vectors z_{1:T}, with different options for recurrent connections. One of them, called Feedback Recurrent AutoEncoder (FRAE), has recurrent connections at both the encoder and the decoder, and a feedback connection from decoder to encoder that is reminiscent of the classical closed-loop coding principle (Gersho and Gray, 2012), even if the authors of (Yang et al., 2020) do not refer to it explicitly. Interestingly, FRAE can be interpreted as a non-linear predictive coding scheme: in short, the encoder forms a latent code which encodes only the residual information missing to reconstruct a data vector from the decoder's deterministic internal state.…”
Section: Perspectives In Source Coding (mentioning)
confidence: 99%
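To make the closed-loop structure described above concrete, here is a minimal sketch of one FRAE time step. It is an illustration only: the module name FRAEStep, the GRU-cell recurrences, the layer sizes, and the rounding-based straight-through quantizer are assumptions for exposition, not the paper's exact architecture. The key point it shows is the feedback connection: the encoder conditions on the decoder's previous state, so the transmitted code only has to carry the residual information the decoder cannot already predict.

```python
import torch
import torch.nn as nn

class FRAEStep(nn.Module):
    """One time step of a feedback recurrent autoencoder (illustrative sketch)."""
    def __init__(self, x_dim=80, h_dim=256, z_dim=8):
        super().__init__()
        self.enc_rnn = nn.GRUCell(x_dim + h_dim, h_dim)  # encoder sees input + decoder feedback
        self.to_code = nn.Linear(h_dim, z_dim)
        self.dec_rnn = nn.GRUCell(z_dim, h_dim)          # decoder evolves its own deterministic state
        self.to_recon = nn.Linear(h_dim, x_dim)

    def forward(self, x_t, h_enc, h_dec):
        # Feedback connection: the encoder is conditioned on the decoder's previous state,
        # so the code z_t only needs to encode the residual information.
        h_enc = self.enc_rnn(torch.cat([x_t, h_dec], dim=-1), h_enc)
        z_t = self.to_code(h_enc)
        # Straight-through rounding as a stand-in for the learned discrete bottleneck.
        z_q = z_t + (torch.round(z_t) - z_t).detach()
        h_dec = self.dec_rnn(z_q, h_dec)
        x_hat = self.to_recon(h_dec)
        return x_hat, z_q, h_enc, h_dec
```

At decode time only the quantized codes are transmitted; both encoder and decoder run the same decoder recurrence, which is what makes the scheme a non-linear analogue of closed-loop predictive coding.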
“…We compute the bitrate (bits-per-pixel, or BPP for short) and the reconstruction quality, which is measured in PSNR on 8-bit RGB space for all methods, averaged across all frames. We note that PSNR is a more challenging metric than MS-SSIM [64] for learned codecs [4], [10], [32], [58], [65], [66]. Since most existing neural compression methods assume video input in 8-bit RGB444 format (24 bits per pixel), we follow this convention by converting test videos from YUV420 to RGB444 in our evaluations for meaningful comparison.…”
Section: Training and Evaluation Procedures (mentioning)
confidence: 99%
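For reference, the quality metric described in this statement, PSNR on 8-bit RGB frames averaged over a sequence, can be computed as in the following NumPy sketch (the frame shapes and the per-frame averaging convention are assumptions for illustration):

```python
import numpy as np

def psnr_8bit(ref: np.ndarray, rec: np.ndarray) -> float:
    """PSNR between two 8-bit RGB frames of shape (H, W, 3), peak value 255."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return float(10.0 * np.log10(255.0 ** 2 / mse))

# Per-sequence quality is the PSNR averaged across all frames; the rate is reported
# as bits-per-pixel (BPP): total coded bits / (num_frames * H * W).
```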
“…Additionally, a theoretical study of the sequence-to-sequence framework for time series forecasting, allowing theoretical bounds to be determined, has been performed by (Kuznetsov and Mariet, 2018). Last but not least, recurrent encoder-decoder architectures have been effectively employed for dimensionality reduction in the signal processing field (Yang et al., 2020), (Susik, 2020). For these reasons, we assess in this paper two recurrent autoencoders based on LSTM (Hochreiter and Schmidhuber, 1997) and GRU (Cho et al., 2014) units, respectively.…”
Section: Recurrent Autoencoders (mentioning)
confidence: 99%
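As an illustration of the recurrent-autoencoder setup this last statement refers to (dimensionality reduction of a sequence through a recurrent bottleneck), below is a minimal GRU-based sequence autoencoder sketch. The layer sizes and the choice of using the final encoder state as the low-dimensional code are assumptions for exposition, not the cited papers' exact models.

```python
import torch
import torch.nn as nn

class GRUSeqAutoencoder(nn.Module):
    """Compress a sequence to a single low-dimensional vector and reconstruct it."""
    def __init__(self, x_dim=64, code_dim=16, h_dim=128):
        super().__init__()
        self.encoder = nn.GRU(x_dim, h_dim, batch_first=True)
        self.to_code = nn.Linear(h_dim, code_dim)   # bottleneck = dimensionality reduction
        self.decoder = nn.GRU(code_dim, h_dim, batch_first=True)
        self.to_output = nn.Linear(h_dim, x_dim)

    def forward(self, x):                           # x: (batch, time, x_dim)
        _, h_last = self.encoder(x)                 # h_last: (1, batch, h_dim)
        code = self.to_code(h_last[-1])             # (batch, code_dim)
        # Feed the code at every decoder step to reconstruct the whole sequence.
        code_seq = code.unsqueeze(1).expand(-1, x.size(1), -1)
        dec_out, _ = self.decoder(code_seq)
        return self.to_output(dec_out), code
```

The LSTM variant mentioned in the excerpt follows the same pattern, with the GRU layers swapped for LSTMs and the final hidden state unpacked from the (h, c) tuple.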