Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
DOI: 10.1109/icassp.2005.1415167

On desensitizing the Mel-Cepstrum to spurious spectral components for Robust Speech Recognition

Cited by 42 publications (22 citation statements)
References 6 publications
“…The MFCCs are non-parametric representations of the audio signals and are used to model the human auditory perception system [9]. Therefore, MFCCs are useful for audio recognition [14]. This method had made important contributions in music retrieval to date.…”
Section: A) Music Content Representation (mentioning)
confidence: 99%
“…Although the MFCC is known to be very efficient in characterizing the human auditory system, the MFCC values are not very robust in the actual environments, and so some researchers have proposed modifications to the basic MFCC algorithm. Especially, Tyagi and Wellekens suggested a method to desensitize the MFCC coefficients to spurious low-energy spectral perturbation and reported enhanced performance for speech recognition [5]. In this regard, it has been observed in the literature that no weights are applied to the MFCCs in all mel-filter bank indexes without taking full consideration of the relative importance of the MFCCs for gender identification [6,7].…”
Section: Introduction (mentioning)
confidence: 99%
“…Note that direct application of the steepest descent technique can not be allowed due to the constraints on the weights as specified in (5). Hence, we consider the following parameter transformation w…”
Section: MFCC Weight Optimization Using MCE Training (mentioning)
confidence: 99%
“…In this paper, we set SNR at 40 dB. Then the training samples are processed to extract log Mel-filterbank (LMFB) [36] features followed by mean normalization. We also augment the LMFB with pitch-related features [37].…”
Section: System Overview (mentioning)
confidence: 99%
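
The last statement above describes log Mel-filterbank (LMFB) extraction followed by mean normalization. A minimal sketch of those two final steps is given here, assuming the Mel-filterbank energies are already computed and that the mean is taken per band over all frames of one utterance; the quoted text does not specify either detail. The function names `log_compress` and `mean_normalize`, the `floor` parameter, and the toy energy values are hypothetical and only serve the illustration.

```python
import math


def log_compress(energies, floor=1e-10):
    """Natural log of each Mel-band energy, floored to avoid log(0).

    The floor value is an assumption, not taken from the cited work.
    """
    return [math.log(max(e, floor)) for e in energies]


def mean_normalize(frames):
    """Subtract the per-band mean, computed over all frames, from every frame.

    `frames` is a list of equal-length feature vectors (one per time frame).
    """
    n_frames = len(frames)
    n_bands = len(frames[0])
    means = [sum(frame[b] for frame in frames) / n_frames for b in range(n_bands)]
    return [[frame[b] - means[b] for b in range(n_bands)] for frame in frames]


# Toy Mel-filterbank energies: 3 frames x 3 bands (made-up values).
mel_energies = [
    [1.0, 4.0, 9.0],
    [2.0, 8.0, 3.0],
    [4.0, 2.0, 1.0],
]

lmfb = [log_compress(frame) for frame in mel_energies]
normalized = mean_normalize(lmfb)
```

After normalization each band has zero mean across the utterance, which removes a constant offset in the log domain, such as a fixed channel gain.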