[Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing 1991
DOI: 10.1109/icassp.1991.150401

Acoustic-phonetic transformations for improved speaker-independent isolated word recognition

Cited by 13 publications (6 citation statements)
References 3 publications
“…In addition, in order to improve the distinguishability and robustness of acoustic features, reduce feature dimensions, and meet the actual needs of continuous speech recognition systems, researchers have also proposed a variety of feature transformation methods [22][23][24][25][26][27][28][29][30][31][32]; the relevant research is shown in Table 1. LDA, Heteroscedastic Discriminant Analysis (HDA), Generalized Likelihood Ratio Discriminant Analysis (GLRDA), etc., can improve feature discrimination and reduce feature dimensions; Maximum Likelihood Linear Regression (MLLR), fMLLR, and Vocal Tract Length Normalization (VTLN) can eliminate speech information that has nothing to do with the recognition result, such as people or soundtrack, improving the robustness of features.…”
Section: General Approach To Feature Extraction (mentioning)
confidence: 99%
“…LDA, Heteroscedastic Discriminant Analysis (HDA), Generalized Likelihood Ratio Discriminant Analysis (GLRDA), etc., can improve feature discrimination and reduce feature dimensions; Maximum Likelihood Linear Regression (MLLR), fMLLR, and Vocal Tract Length Normalization (VTLN) can eliminate speech information that has nothing to do with the recognition result, such as people or soundtrack, improving the robustness of features. [Table 1, flattened in extraction:] Discrimination and dimensionality reduction; [22] TI/NBS connected digit database; [23] a 30-word single-syllable highly confusable vocabulary; [24] CVC syllables database; [25] HDA, TI-DIGITS, promote LDA to deal with heteroscedasticity, discrimination and dimensionality reduction; [26] GLRDA, 200 h of MATBN Mandarin television news…”
Section: General Approach To Feature Extraction (mentioning)
confidence: 99%
“…Linear discriminant analysis is a well-known technique in statistical pattern recognition which has been applied recently to speech recognition to improve recognition performance (Brown, 1987; Yu et al, 1990; Zahorian et al, 1991; Hunt and Lefebre, 1989; Umbach and Ney, 1992). The idea in linear discriminant analysis (LDA) is to find a transformation matrix which projects feature vectors from an n-dimensional space to an m-dimensional space (m < n) such that a class separability criterion is maximized.…”
Section: Introduction (mentioning)
confidence: 99%
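
The projection described in the statement above can be illustrated in its simplest form: two classes, n = 2 input dimensions, m = 1 output dimension. The following is a minimal sketch of the classical two-class Fisher discriminant, where the projection direction is proportional to S_w^{-1}(mu_b - mu_a). It is not the method of the cited paper; the toy data and every function name are assumptions made for illustration only.

```python
# Two-class Fisher LDA in pure Python: project 2-D features (n = 2) to 1-D (m = 1).
# The data points and helper names are illustrative, not taken from the cited papers.

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def scatter(vectors, mu):
    """2x2 scatter matrix: sum of (x - mu)(x - mu)^T over one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - mu[0], v[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Return w = S_w^{-1} (mu_b - mu_a), the direction maximizing class separability."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, v):
    """Project a 2-D feature vector onto the 1-D discriminant axis."""
    return w[0] * v[0] + w[1] * v[1]

# Toy feature vectors for two classes (hypothetical values).
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]

w = fisher_direction(class_a, class_b)
proj_a = [project(w, v) for v in class_a]
proj_b = [project(w, v) for v in class_b]
print("direction:", w)
print("class A projections:", proj_a)
print("class B projections:", proj_b)
```

With more than two classes or a larger m, the same criterion is usually solved as a generalized eigenvalue problem on the between-class and within-class scatter matrices, which in practice calls for a linear algebra library rather than hand-written 2x2 inversion.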
“…For both training and testing data, the modified Discrete Cosine Transformation Coefficients (DCTC) and Discrete Cosine Series Coefficients (DCSC) (Zahorian et al 1991; Zahorian et al, 1997; Zahorian et al, 2002; Karnjanadecha & Zahorian, 1999) were extracted as original features. The modified DCTC is used for representing speech spectra, and the modified DCSC is used to represent spectral trajectories.…”
Section: DCTC/DCSC Speech Features (mentioning)
confidence: 99%