Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181)
DOI: 10.1109/icassp.1998.675367

Towards speech rate independence in large vocabulary continuous speech recognition

Abstract: Speech Technology Group, Telefónica Investigación y Desarrollo, S.A. Unipersonal, 28043 Madrid (Spain). In this paper we present a new speech rate classifier (SRC) which is directly based on the dynamic coefficients of the feature vectors and is suitable for use in real time. We also report the study that has been carried out to determine which parameters of speech are best suited to the speech rate classification problem. In this study we analyse the correlation between several speech parameters and the …

Cited by 27 publications (18 citation statements)
References 6 publications
“…A first step toward addressing this issue, i.e., to help improve the match between the models used and the speech being processed for recognition, is to quantify the inherent speech rate variability. Then, once an estimation of the underlying speech rate is done, one could select appropriately pretrained acoustic models [25], [54] or adaptively set transition probabilities of the HMMs [4], [5] that appropriately reflect the rate of the speech being measured.…”
Section: A Significance (mentioning)
confidence: 99%
“…The acoustic changes, such as coarticulation, are modeled by directly adapting the acoustic models (or a subset of their parameters, i.e. weights and transition probabilities) to the different speaking rates (Bard et al, 2001; Martinez et al, 1998; Morgan et al, 1997; Shinozaki and Furui, 2003; Zheng et al, 2000). Most of the approaches are based on a separation of the training material into discrete speaking rate classes, which are then used for the training of rate dependent models.…”
Section: Pronunciation Modeling Techniques (mentioning)
confidence: 99%
“…These experiments were motivated by the fact that the analysis on phone duration, presented in Section 3, revealed that adults and children in the speech corpora used in this work presented a very different mean phone duration. It can be hypothesized that the effect of the speaking rate is mostly concentrated on the first and second order time derivatives of the MFCCs (Martinez et al, 1998), therefore performing mean and variance normalization of dynamic features could be useful to compensate for very different speaking rates.…”
Section: Results Of (mentioning)
confidence: 99%
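
The normalization hypothesized in the citation statement above can be illustrated with a minimal sketch. It is not taken from Martinez et al. (1998) or from the citing paper: the regression-based delta formula, the window `width=2`, the helper names `deltas` and `mean_variance_normalize`, and the toy one-dimensional track are all assumptions made for illustration, and only the Python standard library is used.

```python
import math

def deltas(frames, width=2):
    """Regression-based time derivatives of a sequence of feature vectors.

    frames: list of equal-length lists, one feature vector per frame.
    width: frames used on each side of the current one (assumed value).
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        vec = []
        for d in range(len(frames[0])):
            acc = 0.0
            for k in range(1, width + 1):
                # Clamp indices at the edges so every frame gets a delta.
                nxt = frames[min(t + k, n - 1)][d]
                prv = frames[max(t - k, 0)][d]
                acc += k * (nxt - prv)
            vec.append(acc / denom)
        out.append(vec)
    return out

def mean_variance_normalize(frames, eps=1e-8):
    """Shift each dimension to zero mean and scale it to unit variance."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dims)]
        for f in frames
    ]

# Toy one-dimensional "cepstral" track, for illustration only.
static = [[0.0], [1.0], [4.0], [9.0], [16.0], [25.0]]
delta = deltas(static)        # first-order dynamic features
delta_delta = deltas(delta)   # second-order dynamic features
norm_delta = mean_variance_normalize(delta)
norm_delta_delta = mean_variance_normalize(delta_delta)
```

Normalizing the dynamic streams per utterance or per speaker, rather than over a whole corpus, is one plausible design choice here, since the statistics would then track the speaking rate of the material being recognized; the citation statement does not say which scope was used.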