2013
DOI: 10.1109/tasl.2013.2270369

Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization

Abstract: Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms based on hidden Markov models (HMM), lead to higher-quality enhanced speech signals. However, the main practical difficulty of these approaches is that for each noise type a model is required to be trained a priori. In this paper, we investigate a new class of supe…
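The abstract contrasts unsupervised enhancement (e.g., Wiener filtering) with supervised, model-based approaches. As background for the NMF machinery the paper builds on, here is a minimal sketch of plain NMF via the Lee-Seung multiplicative updates for the Euclidean cost; the matrix sizes and function name are illustrative, not taken from the paper:

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-10, seed=0):
    """Factor a non-negative matrix V (freq x time) as W @ H using
    Lee-Seung multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Each update keeps W and H non-negative and never increases the cost.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy exactly-rank-2 "spectrogram", so the factorization can recover it.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 20))
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Because the updates are multiplicative, non-negativity of W and H is preserved automatically, which is what makes the factors interpretable as spectral atoms and their activations.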


Cited by 365 publications (227 citation statements)
References 40 publications
“…Unlike traditional speech enhancement techniques (e.g., [1][2][3][4][5]), which focus on dealing with the noise-corrupted speech signal (i.e., speech-plus-noise mixture) and on removing background noise from the signal to achieve better listening experiences for listeners, these speech modification algorithms aim to alter the original clean speech signal so that the intelligibility may be preserved even when listened to in non-ideal listening conditions, in which background masking sources may exist. While the majority of modification algorithms operate in the frequency domain, such as enhancing frequency components which are important to speech intelligibility in noise [6][7][8] and boosting certain spectral regions based on optimising objective intelligibility metrics [9][10][11][12], other algorithms make changes in the time domain, including introducing pauses into speech and speeding up or slowing down part of the speech to avoid a temporal clash between the speech and masker [10,13].…”
Section: Introduction
confidence: 99%
“…(7) corresponds to a non-negative matrix factorization (NMF) model placed on the F × L matrix of variances of the source coefficients; a now common practice in audio signal processing, e.g. [16,17,18].…”
Section: The Source Model
confidence: 99%
“…Approaches based on statistical properties, such as minimum mean squared error (MMSE) estimation and optimally-modified log-spectral amplitude (OM-LSA), can take human hearing properties into account and reduce speech distortion and residual noise to some extent [1]. In recent years, supervised learning methods have seen significant development in speech signal processing [2][3]. As a well-known method for mining implicit local representations in non-negative data, non-negative matrix factorization (NMF) uses non-negative linear combinations to separate the clean and noise signals from noisy speech.…”
Section: Introduction
confidence: 99%
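The excerpt above describes how NMF separates clean and noise contributions from noisy speech. A minimal sketch of the supervised variant it alludes to, under the common assumption that speech and noise basis matrices have been pre-trained offline on clean-speech and noise spectrograms, and that a Wiener-style mask is formed from the two partial reconstructions; all array sizes and names here are illustrative:

```python
import numpy as np

def fit_activations(V, W, n_iter=300, eps=1e-10, seed=0):
    """With the basis W held fixed, learn non-negative activations H
    for the spectrogram V by Euclidean multiplicative updates."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Stand-ins for bases pre-trained on clean-speech / noise training data.
rng = np.random.default_rng(0)
W_speech = rng.random((8, 3))   # 8 frequency bins, 3 speech atoms
W_noise = rng.random((8, 2))    # 2 noise atoms
W = np.hstack([W_speech, W_noise])

V_mix = rng.random((8, 30))     # magnitude spectrogram of noisy speech
H = fit_activations(V_mix, W)
V_s = W_speech @ H[:3]          # speech-only reconstruction
V_n = W_noise @ H[3:]           # noise-only reconstruction

# Wiener-style mask: the speech share of the model's total reconstruction.
mask = V_s / (V_s + V_n + 1e-10)
V_speech_est = mask * V_mix     # enhanced magnitude spectrogram
```

Because both reconstructions are non-negative, the mask is automatically bounded in [0, 1], so the enhanced spectrogram never exceeds the mixture magnitude in any time-frequency bin.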