Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
DOI: 10.1109/icslp.1996.607277

Cepstral compensation by polynomial approximation for environment-independent speech recognition

Abstract: Speech recognition systems perform poorly on speech degraded by even simple effects such as linear filtering and additive noise. One possible solution to this problem is to modify the probability density function (PDF) of clean speech to account for the effects of the degradation. However, even for the case of linear filtering and additive noise, it is extremely difficult to do this analytically. Previously attempted analytical solutions to the problem of noisy speech recognition have either used an overly-sim…

Cited by 17 publications (13 citation statements). References 7 publications.
“…This assumption is very important because it dramatically reduces the number of parameters to estimate and the amount of adaptation data required. Despite the fact that (27) to estimate d M is the same expression employed to estimate convolutional distortion (Acero & Stern, 1990) if additive noise is not present (Yoma, 1998-B), the methods in (Acero & Stern, 1990;Moreno et al, 1995;Raj et al, 1996) do not compensate the HMMs. Notice that the effect of the transfer function that represents a linear channel is supposed to be an additive constant in the log-cepstral domain.…”
Section: Compute Pr( ) (mentioning)
confidence: 99%
“…al, 2006) proposes a model of the low bit rate coding-decoding distortion that is different from the model of the additive and convolutional noise, although they are similar to some extent. The mean and variance compensation is code-word dependent in (Acero & Stern, 1990;Moreno et al, 1995;Raj et al, 1996). In contrast, V are considered independent of the code-word in (Yoma et.…”
Section: Compute Pr( ) (mentioning)
confidence: 99%
“…Speaker robust technologies comprise speaker normalization and model adaptation, such as speaker-adaptive training (SAT) [3], vocal tract length normalization (VTLN) [4]. Noise robustness and compensation technologies comprise the parallel model combination (PMC) [5], Vector Taylor Series (VTS/VPS) [6], [7], Quantile based Histogram Equalization (HEQ) [8], Switching Linear Dynamic Model (SLDM) [9], and etc. Other types of variations also attract more and more attentions, such as the emotion affected speech recognition (EASR).…”
Section: Introduction (mentioning)
confidence: 99%
“…Other related work includes speech enhancement [4], normalization of the noisy features statistical properties [5]- [7], and dynamic feature combinations [8]. In [9]- [11], the effect of environmental noise on the statistical speech models was investigated and two algorithms (CDCN and MFCDCN) were proposed for compensating it. However, the feature robustness problem remains unsolved in a globally optimal way.…”
mentioning
confidence: 99%