IEEE Workshop on Automatic Speech Recognition and Understanding, 2005
DOI: 10.1109/asru.2005.1566500

Unsupervised spectral subtraction for noise-robust ASR

Abstract: This paper proposes a simple, computationally efficient 2-mixture model approach to discriminate between speech and background noise at the magnitude spectrogram level. It is directly derived from observations on real data, and can be used in a fully unsupervised manner, with the EM algorithm. In this paper, the 2-mixture model is used in an "Unsupervised Spectral Subtraction" scheme that can be applied as a pre-processing step for any acoustic feature extraction scheme, such as MFCCs or PLP. The goal is to im…
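The idea of fitting a 2-mixture model with EM, without labels, can be illustrated with a minimal sketch. The abstract above does not state which distributions the paper uses, so this sketch swaps in a plain two-component Gaussian mixture over log-magnitude values; it is not the paper's model. The function names `fit_two_mixture` and `speech_posterior`, the percentile-based initialisation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_two_mixture(x, n_iter=100):
    """Fit a 2-component 1-D Gaussian mixture to x with EM (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: one mean in the low tail, one in the high tail.
    mu = np.percentile(x, [5.0, 95.0])
    var = np.full(2, x.var() + 1e-8)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        log_p = (np.log(w)
                 - 0.5 * np.log(2.0 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-8
    return w, mu, var, resp


def speech_posterior(log_mag):
    """Per-bin posterior of the higher-energy component (assumed to be speech)."""
    _, mu, _, resp = fit_two_mixture(log_mag.ravel())
    speech = int(np.argmax(mu))  # assumption: speech bins have higher energy
    return resp[:, speech].reshape(log_mag.shape)


# Synthetic stand-in for a log-magnitude spectrogram (not real speech data):
# 40 low-energy "noise" frames followed by 10 high-energy "speech" frames.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=(20, 40))
speech = rng.normal(6.0, 1.0, size=(20, 10))
log_mag = np.concatenate([noise, speech], axis=1)
post = speech_posterior(log_mag)
```

In a subtraction scheme, a posterior map like `post` could serve as a soft mask on the magnitude spectrogram before feature extraction; how the paper actually uses its model is not recoverable from the truncated abstract.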

Cited by 26 publications (25 citation statements)
References 17 publications (20 reference statements)
“…Each test utterance was corrupted with additive noise independently sampled from a Gaussian with zero mean and covariance σ². The features for the HMM were computed using Unsupervised Spectral Subtraction (USS) [17], thereby providing filtered features to the HMM recogniser. The setup used for the feature-based HMM was the same as that used to obtain the baseline performance on the AURORA task [13], namely 18 states, left-right transition matrix, a mixture of three Gaussians per state and 39 MFCC features, including first and second temporal derivatives as well as energy.…”
Section: Results (mentioning)
confidence: 99%
“…Whilst successful under controlled conditions, this standard approach is often particularly fragile in the presence of noise [17]. This important issue is commonly addressed by a preprocessing step which attempts to remove noise, see for example [17], [7], [12], [25] and [1].…”
Section: Introduction (mentioning)
confidence: 99%
“…Since the proposed approach involves working with specific frequency points in the spectrum, it might be directly coupled with a suitable time-frequency masking framework aimed at noise removal [32] or signal separation [33]. Finally, since the approach is data-driven, it could be applied to other related tasks like phoneme recognition.…”
Section: Discussion (mentioning)
confidence: 99%
“…Ris and Dupont [5] present a survey of methods to measure noise, favouring the low-energy envelope tracking approach of Martin [6]. Lathoud et al [7] present a statistical spectral model that yields both noise and speech estimates.…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, histogram normalisation, a logical progression of CMN/CVN to higher order moments, has been successfully combined with spectral compensation techniques by Segura et al [8]. Lathoud et al [7], who describe their technique as "Unsupervised" spectral subtraction (USS), also report good results in combination with cepstral normalisation.…”
Section: Introduction (mentioning)
confidence: 99%