1999
DOI: 10.1016/s0167-6393(99)00002-3

On the relative importance of various components of the modulation spectrum for automatic speech recognition

Cited by 120 publications (91 citation statements)
References 10 publications
“…It is a bandpass filter that removes the very low frequency and high frequency components of feature trajectories. The design agrees with research findings that speech modulation frequency of 1-16Hz is most important for both human and automatic speech recognition [62][63][64][65][66][67][68]. RASTA and CMN are both able to reduce channel distortions, and they can be used in concatenation to produce better results.…”
Section: Temporal Filtering (citation type: supporting)
confidence: 86%
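
The band-pass idea in the statement above can be sketched roughly as follows. This is not the RASTA filter itself (RASTA is usually implemented as a recursive filter); it is an idealized FFT mask that keeps only the 1-16 Hz modulation band of each feature trajectory. The function name `bandpass_trajectories`, the 100 Hz frame rate, and the synthetic test signal are assumptions made for illustration only.

```python
import numpy as np

def bandpass_trajectories(features, frame_rate=100.0, low_hz=1.0, high_hz=16.0):
    """Keep only modulation components in [low_hz, high_hz] along the time axis.

    features: array of shape (num_frames, num_coeffs), one row per frame.
    Idealized FFT mask, assumed here for illustration; not the RASTA filter.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    # Modulation spectrum of every coefficient trajectory.
    spectrum = np.fft.rfft(features, axis=0)
    freqs = np.fft.rfftfreq(num_frames, d=1.0 / frame_rate)
    # Zero everything outside the pass band (including the DC term).
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0.0
    return np.fft.irfft(spectrum, n=num_frames, axis=0)

# Synthetic trajectory: constant offset + 4 Hz component + 30 Hz component.
t = np.arange(200) / 100.0
slow = np.sin(2 * np.pi * 4 * t)
fast = 0.5 * np.sin(2 * np.pi * 30 * t)
trajectory = (3.0 + slow + fast)[:, None]
filtered = bandpass_trajectories(trajectory)
```

In this toy case the constant offset (a stand-in for a fixed channel effect) and the 30 Hz component are removed, and only the 4 Hz component remains in `filtered`.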
“…In this paper, we use an LPF cut-off frequency of 20 Hz in both equations because an important modulation region for speech perception [15] and speech recognition is from 1 to 16 Hz [16,17].…”
Section: Extraction of the Power Envelope (citation type: mentioning)
confidence: 99%
“…This fact is further confirmed by automatic speech recognition (ASR) experiments [3] which shows that the low modulation frequency bands (0-1Hz) and high modulation frequency bands are harmful (or useless) for ASR. The most widely used cepstral-based features in state-of-art ASR system, however, are typically sampled at a rate of 100Hz, giving a 50Hz bandwidth for representing modulation energy, something that is overkill.…”
Section: Introduction (citation type: mentioning)
confidence: 71%
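
The sampling arithmetic behind the statement above can be checked with a short sketch. The helper names are hypothetical, and the second function is a lower bound from the sampling theorem (in practice the rate should be somewhat higher than twice the highest modulation frequency of interest).

```python
def representable_bandwidth_hz(frame_rate_hz):
    """Highest modulation frequency representable at a given feature frame rate."""
    return frame_rate_hz / 2.0

def min_frame_rate_hz(max_modulation_hz):
    """Lower bound on the frame rate needed to represent max_modulation_hz."""
    return 2.0 * max_modulation_hz

print(representable_bandwidth_hz(100.0))  # 50.0
print(min_frame_rate_hz(16.0))            # 32.0
```

Under these assumptions, a 100 Hz frame rate covers modulation frequencies up to 50 Hz, while the 1-16 Hz band cited above would need a frame rate of only about 32 Hz.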