IEEE International Conference on Acoustics Speech and Signal Processing 2002
DOI: 10.1109/icassp.2002.1005761

Non-linear transformations of the feature space for robust speech recognition

Cited by 14 publications
(12 citation statements)
References 0 publications
“…In (de la Torre et al, 2002; Skosan, 2005) it was shown that distortions caused by additive noise and linear filtering (distortions often encountered in telephone networks) corrupts spectral-based features (like those used in speech and speaker recognition systems) and, as a result, modifies their distributions: additive noise deforms the variances of these feature distributions while linear filtering alters the distribution means. In a previous contribution (Skosan and Mashao, 2004), we showed that a compensation technique, known as Histogram Equalization (HEQ), could be used to minimize the disparity between feature distributions collected in mismatched telephone environments.…”
Section: Introduction (mentioning)
confidence: 98%
“…Cepstral mean subtraction (CMS) is a simple but effective way to remove the dc component of features. Several other well-known normalization methods for feature domain have been proposed, such as cumulative histogram used in histogram equalization (HEQ) [6] and cepstral shape normalization (CSN) [7]. On the other hand, it is observed that human auditory system is remarkably more robust than the state-of-the-art ASR systems in the presence of variable noises.…”
Section: Introduction (mentioning)
confidence: 99%
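
For readers unfamiliar with the two normalizations named in the statements above, the following is a minimal sketch, not the implementation from the cited paper or from any of the citing works. It assumes features arrive as a list of frames of cepstral coefficients and that the HEQ reference distribution is a standard Gaussian. The function names `cepstral_mean_subtraction` and `histogram_equalize` are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist, fmean


def cepstral_mean_subtraction(frames):
    """Subtract each coefficient's mean over all frames (removes the DC component)."""
    n_coeffs = len(frames[0])
    means = [fmean(frame[k] for frame in frames) for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


def histogram_equalize(values, reference=NormalDist(0.0, 1.0)):
    """Map one feature track onto a reference distribution via its empirical CDF.

    Assumption: the reference is a standard Gaussian; other references are possible.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the CDF estimate strictly inside (0, 1).
        equalized[i] = reference.inv_cdf((rank + 0.5) / n)
    return equalized
```

Because the mapping in this sketch depends only on the rank of each value, any strictly increasing distortion of the feature track (for example a positive gain plus an offset) yields the same equalized output, whereas mean subtraction alone compensates only for the offset.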
“…1 The relative performance of MSE versus MTE scheme does not solely depend on the signal-to-noise ratio in the frequency band. 2 Higher-order derivatives of the input signal correspond to larger values of p, [21].…”
Section: Medium and Short-time Properties Of Energy Operators (mentioning)
confidence: 99%
“…The effect of noise on the features employed in a speech recognition front-end is nontrivial and can greatly influence the overall system performance. In this context, much work has been done minimizing this mismatch [2], [3] by using transformations of the noisy features to a "cleaner" feature domain, and thus improving their invariability to certain noise types. Other related work includes speech enhancement [4], normalization of the noisy features statistical properties [5]- [7], and dynamic feature combinations [8].…”
mentioning
confidence: 99%