2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
DOI: 10.1109/waspaa52581.2021.9632723
Harp-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding

Cited by 14 publications (3 citation statements)
References 9 publications
“…Other coding architectures use quantized features from different layers of an autoencoder network to code speech at different bitrates [96]. For example, residual networks (ResNet) use "short-cuts" to pass information from one layer directly to a successor layer, as a bypass, while another approach [138] cascades residuals across a series of DNN modules.…”
Section: Residual Network Coding
Confidence: 99%
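The cascaded-residual idea quoted above — each module coding whatever error its predecessor left behind, so that summing any prefix of stage outputs gives a valid, progressively refined reconstruction — can be sketched with a toy stand-in in which every "module" is just a uniform quantizer. The function names and step sizes here are illustrative, not taken from any cited paper.

```python
import numpy as np

def quantize(x, step):
    """Hypothetical stand-in for one coding module: uniform quantization."""
    return np.round(x / step) * step

def cascaded_residual_code(x, steps):
    """Each stage codes the residual left by the previous stage; the sum of
    any prefix of stage outputs is a valid (scalable) reconstruction."""
    stages, residual = [], x
    for step in steps:
        coded = quantize(residual, step)
        stages.append(coded)
        residual = residual - coded
    return stages

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 8)
stages = cascaded_residual_code(x, steps=[0.5, 0.1, 0.02])
coarse = stages[0]        # low-bitrate reconstruction
fine = sum(stages)        # adding later stages shrinks the error
```

With round-to-nearest quantization, the residual after a stage with step `s` lies in `[-s/2, s/2]`, so each extra stage tightens the reconstruction bound — the same scalability property the residual-coding architectures above exploit.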
“…In this work, we combine the merit from both FP- and BP-QAT and propose General Quantizer (GQ) that navigates weights to quantization centroids without introducing augmented regularizers but via feedforward-only operators. Our work is inspired by a continuous relaxation of quantization [25] also used for speech representation learning [26,27,28,29,30,31,32], and the µ-law algorithm for 8-bit pulse-code modulation (PCM) digital telecommunication [33].…”
Section: Related QAT Approaches
Confidence: 99%
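The µ-law algorithm referenced in [33] is the standard companding curve used for 8-bit PCM telephony (ITU-T G.711, µ = 255): it compresses the dynamic range logarithmically so that quiet samples get finer quantization resolution. A minimal sketch of the compress/expand pair:

```python
import math

MU = 255  # standard value for 8-bit mu-law PCM (ITU-T G.711)

def mu_law_compress(x, mu=MU):
    """Compand a sample in [-1, 1] to [-1, 1] along the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Invert the companding, mapping [-1, 1] back to [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

# A quiet sample is expanded toward mid-scale before quantization:
y = mu_law_compress(0.01)          # ~0.23, much larger than 0.01
x_rec = mu_law_expand(y)           # round-trips back to ~0.01
```

In an 8-bit codec the companded value `y` would then be uniformly quantized; the expansion step after dequantization restores the original amplitude scale.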
“…This has motivated data-driven approaches to train neural networks to perform speech coding. These networks leverage large amounts of training data while relaxing the assumptions made on the type of transformations applied by the system [3][4][5][6][7][8][9][10]. In particular, the SoundStream neural codec combines a causal convolutional architecture with a residual vector quantizer.…”
Section: Introduction
Confidence: 99%
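A residual vector quantizer of the kind SoundStream uses can be sketched as a cascade of nearest-neighbor codebooks, each quantizing the residual left by the previous stage; the decoder sums the selected codewords. The codebooks below are random placeholders standing in for trained ones, and the function names are illustrative.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Residual VQ: each codebook quantizes the residual of the previous one."""
    indices, residual = [], x.copy()
    for cb in codebooks:                           # cb has shape (num_codes, dim)
        i = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        indices.append(i)
        residual = residual - cb[i]
    return indices

def rvq_decode(indices, codebooks):
    """Reconstruction is the sum of one codeword per stage."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

rng = np.random.default_rng(1)
codebooks = [rng.normal(size=(16, 4)),             # coarse stage
             0.1 * rng.normal(size=(16, 4))]       # finer refinement stage
x = rng.normal(size=4)
idx = rvq_encode(x, codebooks)                     # one index per stage
x_hat = rvq_decode(idx, codebooks)
```

Transmitting one index per stage is what makes the scheme bitrate-scalable: dropping trailing stages still yields a coarser but valid reconstruction, which mirrors the layered decoding idea in the paper this page tracks.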