Interspeech 2017
DOI: 10.21437/interspeech.2017-1672

Speech Enhancement Using Bayesian Wavenet

Abstract: In recent years, deep learning has achieved great success in speech enhancement. However, there are two major limitations regarding existing works. First, the Bayesian framework is not adopted in many such deep-learning-based algorithms. In particular, the prior distribution for speech in the Bayesian framework has been shown useful by regularizing the output to be in the speech space, and thus improving the performance. Second, the majority of the existing methods operate on the frequency domain of the noisy …

Cited by 84 publications (73 citation statements)
References 18 publications
“…Since the conclusion of our experiments other more advanced neural-network based denoising techniques have been proposed, such as generative adversarial networks [51], Wavenet-style based systems [52], [53] and convolutional neural networks [54], [55]. The latter showing improvements upon RNN based methods [55] in terms of PESQ and STOI scores.…”
Section: Our Work In Context (mentioning)
confidence: 88%
“…Wavenet [1] is an autoregressive convolutional neural network that produces raw audio waveforms by directly modeling the underlying probability distribution of audio samples. This has led to state-of-the-art performance in text-to-speech synthesis [2], [7], [17], [18], speech recognition [19], and other audio generation settings [1], [3], [4]. The Wavenet architecture aims to model the conditional probability among subsequent audio samples.…”
Section: B. Wavenet and Autoregressive CNNs (mentioning)
confidence: 99%
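The statement above describes Wavenet as modelling each audio sample conditioned on the samples before it. The following is a minimal NumPy sketch of that idea, not code from any of the cited papers. It assumes the dilated causal convolutions of the original WaveNet design (the quoted text does not mention dilation), and the function names `causal_dilated_conv`, `receptive_field`, and `sequence_log_prob` are hypothetical, chosen only for illustration.

```python
import numpy as np

def causal_dilated_conv(x, weights, dilation):
    """Causal 1-D convolution: y[t] = sum_k weights[k] * x[t - k*dilation].

    Samples before the start of the signal are treated as zero, so y[t]
    never depends on any sample after time t.
    """
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, w in enumerate(weights):
        shift = k * dilation
        if shift >= len(x):
            break  # this tap would only ever read the zero padding
        y[shift:] += w * x[:len(x) - shift]
    return y

def receptive_field(kernel_size, dilations):
    """Number of current-and-past samples one output of the stack can see."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def sequence_log_prob(conditional_log_probs):
    """Chain rule: log p(x) = sum_t log p(x_t | x_1, ..., x_{t-1})."""
    return float(np.sum(conditional_log_probs))
```

For example, `causal_dilated_conv([1, 2, 3, 4], [1, 1], dilation=2)` adds each sample to the one two steps earlier, and changing the last input sample leaves all earlier outputs unchanged. A stack with kernel size 2 and dilations 1, 2, 4, 8 covers 16 samples, which is why dilation is a common way to reach long audio contexts with few layers.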
“…Autoregressive convolutional models achieve state-of-the-art results in audio [1]-[4] and language domains [5], [6] with respect to both estimating the data distribution and generating high-quality samples. Wavenet [1] is an example of autoregressive convolutional network, used for modelling audio for applications such as text-to-speech (TTS) synthesis and music generation.…”
Section: Introduction (mentioning)
confidence: 99%
“…That is, these methods exploit the sophistication of the generative network model to find a better approximation of the clean signal waveform. For example, [26] approximates the clean signal waveform using a Bayesian formalism that incorporates the structure of WaveNet. [11] uses a WaveNet structure to create a deterministic mapping from the noisy waveform to the clean waveform approximation.…”
Section: Generative Enhancement (mentioning)
confidence: 99%
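The "Bayesian formalism" mentioned in the statement above can be illustrated with a generic maximum a posteriori (MAP) estimate. This is a textbook sketch, not the paper's exact formulation, which is not given in this excerpt. The symbols are assumptions: x is the noisy waveform, s is a candidate clean waveform, p(x | s) is a noise likelihood, and p(s) is a speech prior, which in a Wavenet-based approach would be an autoregressive model over samples s_1, ..., s_T.

```latex
\hat{s}
  = \arg\max_{s} \, p(s \mid x)
  = \arg\max_{s} \, \bigl[ \log p(x \mid s) + \log p(s) \bigr],
\qquad
\log p(s) = \sum_{t=1}^{T} \log p(s_t \mid s_1, \ldots, s_{t-1})
```

The second term is what the abstract describes as regularizing the output toward the speech space: a candidate that fits the noisy input but is unlikely under the speech prior is penalized.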