ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8683341

Learning to Dequantize Speech Signals by Primal-dual Networks: an Approach for Acoustic Sensor Networks

Cited by 9 publications
(7 citation statements)
References 15 publications
“…15. This loss has shown superior performance compared to the MSE loss in speech enhancement [29], as well as for quantized speech reconstruction [30]. Some more detail is given in Appendix 1.…”
Section: Baseline Pw-filt
Citation type: mentioning
confidence: 99%
“…The parameters of the deep learning architectures are then optimized by minimizing the MSE between the inferred results and their corresponding targets. In reality, optimization of the MSE loss in training does not guarantee any perceptual quality of the speech component and of the residual noise component, respectively, which leads to limited performance [27][28][29][30][31][32][33][34][35][36]. This effect is even more evident when the level of the noise component is significantly higher than that of the speech component in some regions of the noisy speech spectrum, which explains the bad performance at lower SNR conditions when training with MSE.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The perceptual weighting filter is originally used to shape the coding noise / quantization error to be less audible by exploiting the masking property of the human ear [27]. Motivated by the success of the weighting filter when used to shape coding noise, we apply it in this work to design the loss function to achieve improved perceptual quality in the context of speech enhancement, aiming to combat acoustic background noise (instead of coding noise, as has been done in [28]). We propose to extract the frequency response of the weighting filter from the clean speech and directly apply it to the loss function, resulting in an unaltered DNN architecture with no need to change the topology as in [23].…”
Citation type: mentioning
confidence: 99%
“…Even more recently, a deep-learning-based method has been published [16]. Nonetheless, the classical approaches have not yet been fully explored, which is why the goal of this paper is to evaluate a number of sparsity-based audio dequantization methods.…”
Section: Introduction
Citation type: mentioning
confidence: 99%