2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721)
DOI: 10.1109/asru.2003.1318474

Mel-cepstrum modulation spectrum (MCMS) features for robust ASR

Abstract: In this paper, we present new dynamic features derived from the modulation spectrum of the cepstral trajectories of the speech signal. Cepstral trajectories are projected over the basis of sines and cosines, yielding the cepstral modulation frequency response of the speech signal. We show that the different sine and cosine basis vectors select different modulation frequencies, whereas the frequency responses of the delta and the double delta filters are only centered over 15 Hz. Therefore, projectin…
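The projection described in the abstract can be illustrated with a minimal sketch. It assumes a window of `window_len` consecutive frames and a DFT-style sine/cosine basis; the function names, the window length, and the 100 Hz frame rate used in the usage note are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def modulation_projection(cepstra, start, window_len, k):
    """Project each cepstral trajectory onto a cosine and a sine basis vector.

    cepstra: array of shape (n_frames, n_coeffs), one cepstral vector per frame.
    k: index of the modulation-frequency bin (hypothetical parameter name).
    Returns (cos_part, sin_part), each of shape (n_coeffs,).
    """
    segment = cepstra[start:start + window_len]      # (window_len, n_coeffs)
    p = np.arange(window_len)[:, None]               # frame index inside the window
    phase = 2.0 * np.pi * k * p / window_len
    cos_part = np.sum(segment * np.cos(phase), axis=0)
    sin_part = np.sum(segment * np.sin(phase), axis=0)
    return cos_part, sin_part

def bin_to_hz(k, window_len, frame_rate_hz):
    """Modulation frequency selected by bin k (assumes a fixed frame rate)."""
    return k * frame_rate_hz / window_len
```

For example, with an assumed 20-frame window at 100 frames per second, `bin_to_hz(2, 20, 100.0)` gives 10 Hz, so each bin `k` picks out a different modulation frequency of the trajectory.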

Cited by 34 publications (16 citation statements)
References 11 publications (16 reference statements)
“…Some representative examples are temporal patterns (TRAPs) features (Hermansky and Sharma, 1998), MLPs and several modulation spectrum related techniques (Kingsbury et al., 1998; Milner, 1996; Tyagi et al., 2003; Zhu and Alwan, 2000). In this approach temporal trajectories of spectral energies in individual critical bands over windows as long as one second are used as features for pattern classification.…”
mentioning
confidence: 99%
“…This results in the undesirable effect that the same QSS gets analyzed by successively smaller windows, hence increasing the variance of the feature vector of this QSS. On the other hand, the use of a shift size equal to the variable window size will change the Nyquist frequency of the cepstral modulation spectrum [7]. Therefore, the modulation frequency pass-band of the delta filters [7] will vary from frame to frame and may suffer from aliasing for shift sizes in excess of 20ms.…”
Section: Experiments and Results
mentioning
confidence: 99%
“…HMM-GMM systems typically use spectral features based on a constant window size (typically 20ms) and a constant shift size (typically 10ms). The shift size determines the Nyquist frequency of the cepstral modulation spectrum [7], which is typically measured by the delta features of the static MFCC or PLP features. In a variable-scale piecewise quasi-stationary analysis, the shift size should preferably be equal to the size of the detected QSS.…”
Section: Experiments and Results
mentioning
confidence: 99%
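The link between shift size and the Nyquist frequency of the cepstral modulation spectrum, mentioned in the statements above, follows from treating each cepstral trajectory as a signal sampled once per frame shift. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def modulation_nyquist_hz(shift_ms):
    """Nyquist frequency of a cepstral trajectory sampled every shift_ms milliseconds."""
    frame_rate_hz = 1000.0 / shift_ms   # frames per second
    return frame_rate_hz / 2.0          # highest representable modulation frequency

# A 10 ms shift gives 100 frames per second, hence a 50 Hz Nyquist frequency;
# a 20 ms shift halves that to 25 Hz.
for shift in (10.0, 20.0, 40.0):
    print(shift, modulation_nyquist_hz(shift))
```

Under this reading, lengthening the shift lowers the Nyquist frequency, which is why a frame-dependent shift makes the pass-band of a fixed delta filter vary from frame to frame.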
“…In this study, we have explored Mel cepstrum modulation spectrum (MCMS) [12] features together with MFCC features in the context of a forest. The motivation for using MCMS features is that they emphasize different cepstral modulation frequencies as opposed to first- and second-order derivative features that only emphasize modulation frequencies around 15 Hz.…”
Section: Multiple Representations
mentioning
confidence: 99%
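The claim that derivative features emphasize a single band of modulation frequencies can be checked numerically. The sketch below assumes the common regression-based delta filter with a half-window of 2 frames and a 100 Hz frame rate; neither value is stated in the source, and the function name is illustrative.

```python
import numpy as np

def delta_filter_magnitude(freqs_hz, half_window=2, frame_rate_hz=100.0):
    """Magnitude response of d[t] = sum_n n*(c[t+n] - c[t-n]) / (2 * sum_n n^2)."""
    n = np.arange(1, half_window + 1)
    norm = 2.0 * np.sum(n ** 2)
    # Normalized angular frequency per frame, shaped for broadcasting against n.
    omega = 2.0 * np.pi * np.asarray(freqs_hz)[:, None] / frame_rate_hz
    # The antisymmetric taps give H(omega) = 2j * sum_n n*sin(n*omega) / norm.
    return np.abs(2.0 * np.sum(n * np.sin(n * omega), axis=1)) / norm

freqs = np.linspace(0.0, 50.0, 501)   # 0 Hz up to the 50 Hz Nyquist frequency
mag = delta_filter_magnitude(freqs)
peak_hz = freqs[np.argmax(mag)]
print(peak_hz)
```

With these assumed settings the response is a single band-pass lobe peaking in the low teens of Hz, which is consistent with the "around 15 Hz" figure quoted above; a different window length or frame rate would move the peak.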