ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9054074

Feedback Recurrent Autoencoder

Abstract: In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the redundancy along the time dimension and allows a compact discrete representation of the data to be learned. We demonstrate its effectiveness in speech spectrogram compression. Specifically, we show that the FRAE, paired with a powerful neural vocoder, can produce…

Cited by 14 publications (11 citation statements). References 9 publications.

Citation statements (ordered by relevance):
“…As for a general approach to source coding based on DVAEs with a sequence of latent variables, we only found the recent paper (Yang et al., 2020). The authors of this paper propose different schemes for encoding a data sequence x_{1:T} through the inference and quantization of the corresponding sequence of latent vectors z_{1:T}, with different options for recurrent connections.…”
Section: Perspectives In Source Coding (mentioning)
confidence: 99%
“…The authors of this paper propose different schemes for encoding a data sequence x_{1:T} through the inference and quantization of the corresponding sequence of latent vectors z_{1:T}, with different options for recurrent connections. One of them, called Feedback Recurrent AutoEncoder (FRAE), has recurrent connections at both the encoder and the decoder, and a feedback connection from decoder to encoder that is reminiscent of the classical closed-loop coding principle (Gersho and Gray, 2012), even if the authors of (Yang et al., 2020) do not refer to it explicitly. Interestingly, FRAE can be interpreted as a non-linear predictive coding scheme: in short, the encoder forms a latent code which encodes only the residual information missing to reconstruct a data vector from the decoder's deterministic internal state.…”
Section: Perspectives In Source Coding (mentioning)
confidence: 99%
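To make the closed-loop structure described above concrete, here is a minimal sketch of one FRAE time step. It is an illustration only: the module name FRAEStep, the GRU-cell recurrences, the layer sizes, and the rounding-based straight-through quantizer are assumptions for exposition, not the paper's exact architecture. The key point it shows is the feedback connection: the encoder conditions on the decoder's previous state, so the transmitted code only has to carry the residual information the decoder cannot already predict.

```python
import torch
import torch.nn as nn

class FRAEStep(nn.Module):
    """One time step of a feedback recurrent autoencoder (illustrative sketch)."""
    def __init__(self, x_dim=80, h_dim=256, z_dim=8):
        super().__init__()
        self.enc_rnn = nn.GRUCell(x_dim + h_dim, h_dim)  # encoder sees input + decoder feedback
        self.to_code = nn.Linear(h_dim, z_dim)
        self.dec_rnn = nn.GRUCell(z_dim, h_dim)          # decoder evolves its own deterministic state
        self.to_recon = nn.Linear(h_dim, x_dim)

    def forward(self, x_t, h_enc, h_dec):
        # Feedback connection: the encoder is conditioned on the decoder's previous state,
        # so the code z_t only needs to encode the residual information.
        h_enc = self.enc_rnn(torch.cat([x_t, h_dec], dim=-1), h_enc)
        z_t = self.to_code(h_enc)
        # Straight-through rounding as a stand-in for the learned discrete bottleneck.
        z_q = z_t + (torch.round(z_t) - z_t).detach()
        h_dec = self.dec_rnn(z_q, h_dec)
        x_hat = self.to_recon(h_dec)
        return x_hat, z_q, h_enc, h_dec
```

At decode time only the quantized codes are transmitted; both encoder and decoder run the same decoder recurrence, which is what makes the scheme a non-linear analogue of closed-loop predictive coding.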
“…We compute the bitrate (bits-per-pixel, or BPP for short) and the reconstruction quality, which is measured in PSNR on 8-bit RGB space for all methods, averaged across all frames. We note that PSNR is a more challenging metric than MS-SSIM [64] for learned codecs [4], [10], [32], [58], [65], [66]. Since most existing neural compression methods assume video input in 8-bit RGB444 format (24 bits per pixel), we follow this convention by converting test videos from YUV420 to RGB444 in our evaluations for meaningful comparison.…”
Section: Training and Evaluation Procedures (mentioning)
confidence: 99%
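For reference, the quality metric described in this statement, PSNR on 8-bit RGB frames averaged over a sequence, can be computed as in the following NumPy sketch (the frame shapes and the per-frame averaging convention are assumptions for illustration):

```python
import numpy as np

def psnr_8bit(ref: np.ndarray, rec: np.ndarray) -> float:
    """PSNR between two 8-bit RGB frames of shape (H, W, 3), peak value 255."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    return float(10.0 * np.log10(255.0 ** 2 / mse))

# Per-sequence quality is the PSNR averaged across all frames; the rate is reported
# as bits-per-pixel (BPP): total coded bits / (num_frames * H * W).
```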
“…Additionally, a theoretical study of the sequence-to-sequence framework for time series forecasting, allowing theoretical bounds to be determined, has been performed by (Kuznetsov and Mariet, 2018). Last but not least, recurrent encoder-decoder architectures have been effectively employed for dimensionality reduction in the signal processing field (Yang et al., 2020), (Susik, 2020). For these reasons, we assess in this paper two recurrent autoencoders based on LSTM (Hochreiter and Schmidhuber, 1997) and GRU (Cho et al., 2014) units, respectively.…”
Section: Recurrent Autoencoders (mentioning)
confidence: 99%
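As an illustration of the recurrent-autoencoder setup this last statement refers to (dimensionality reduction of a sequence through a recurrent bottleneck), below is a minimal GRU-based sequence autoencoder sketch. The layer sizes and the choice of using the final encoder state as the low-dimensional code are assumptions for exposition, not the cited papers' exact models.

```python
import torch
import torch.nn as nn

class GRUSeqAutoencoder(nn.Module):
    """Compress a sequence to a single low-dimensional vector and reconstruct it."""
    def __init__(self, x_dim=64, code_dim=16, h_dim=128):
        super().__init__()
        self.encoder = nn.GRU(x_dim, h_dim, batch_first=True)
        self.to_code = nn.Linear(h_dim, code_dim)   # bottleneck = dimensionality reduction
        self.decoder = nn.GRU(code_dim, h_dim, batch_first=True)
        self.to_output = nn.Linear(h_dim, x_dim)

    def forward(self, x):                           # x: (batch, time, x_dim)
        _, h_last = self.encoder(x)                 # h_last: (1, batch, h_dim)
        code = self.to_code(h_last[-1])             # (batch, code_dim)
        # Feed the code at every decoder step to reconstruct the whole sequence.
        code_seq = code.unsqueeze(1).expand(-1, x.size(1), -1)
        dec_out, _ = self.decoder(code_seq)
        return self.to_output(dec_out), code
```

The LSTM variant mentioned in the excerpt follows the same pattern, with the GRU layers swapped for LSTMs and the final hidden state unpacked from the (h, c) tuple.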