1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258) 1999
DOI: 10.1109/icassp.1999.759780

Improved methods for vocal tract normalization

Cited by 52 publications (36 citation statements)
References 5 publications
“…During both training and testing a grid search over 21 warping factors evenly distributed, with step 0.02, in the range 0.80-1.20, was performed. The training and recognition procedures adopted for implementing VTLN follow closely those proposed in (Welling et al, 1999) and are described in detail in (Giuliani et al, 2006).…”
Section: VTLN (mentioning)
confidence: 99%
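The grid search quoted above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: `warping_grid`, `best_warping_factor`, and the `score` callable are hypothetical names, and the toy score merely stands in for the acoustic log-likelihood of the warped features.

```python
def warping_grid(lo=0.80, hi=1.20, step=0.02):
    """Return evenly spaced warping factors from lo to hi inclusive."""
    n = round((hi - lo) / step) + 1  # 21 points for the range and step quoted above
    return [round(lo + i * step, 2) for i in range(n)]


def best_warping_factor(score, grid):
    """Return the factor with the highest score.

    `score` is a hypothetical callable; in a real system it would be the
    log-likelihood of the utterance after warping with that factor.
    """
    return max(grid, key=score)


grid = warping_grid()
# Toy score peaking at 0.94, for illustration only.
alpha = best_warping_factor(lambda a: -(a - 0.94) ** 2, grid)
```

Rounding each grid point to two decimals avoids floating-point drift when the step is accumulated.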
“…However, the computation time is more than doubled. Using [the preliminary transcription] instead of the actually spoken, but unknown, transcription does not degrade the recognition performance even if the preliminary transcription has a large word error rate in the order of 20 to 30% (Welling et al, 1999). …”
Section: VTN Principles (mentioning)
confidence: 99%
“…To estimate the warping factor of the test speaker without a preliminary recognition pass, Lee et al (1996) and Welling et al (1999) suggested a text-independent method using Gaussian mixture models. The approach used here relies on a separate emission distribution Pr(X | θ_α) for each warping factor α, where X denotes the sequence of acoustic vectors and θ_α denotes the α-dependent distribution parameters.…”
Section: Fast VTN (mentioning)
confidence: 99%
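A text-independent selection of this kind might look like the sketch below. It assumes one diagonal-covariance Gaussian mixture per warping factor and picks the factor whose mixture gives the highest total log-likelihood for the utterance. The function names, the `(weight, mean, variance)` model layout, and the toy one-dimensional models are all assumptions made for illustration; the quoted statement does not specify how the models are trained or which features they score.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_loglik(frames, gmm):
    """Total log-likelihood of frames under a GMM of (weight, mean, var) tuples."""
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(terms)
        # Log-sum-exp over mixture components for numerical stability.
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def pick_warping_factor(frames, models):
    """`models` maps each warping factor to its GMM; return the best-scoring factor."""
    return max(models, key=lambda a: gmm_loglik(frames, models[a]))


# Toy one-dimensional, single-component models (illustrative values only).
models = {
    0.90: [(1.0, [-1.0], [1.0])],
    1.00: [(1.0, [0.0], [1.0])],
    1.10: [(1.0, [1.0], [1.0])],
}
frames = [[0.9], [1.2], [0.8]]
alpha = pick_warping_factor(frames, models)
```

No transcription is needed here, which is what removes the preliminary recognition pass.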
“…Furthermore, vocal tract length normalization (VTLN) is applied to the MFCC features. The VTLN warping factors are obtained from a Gaussian classifier (fast-VTLN) [18]. In addition, we perform speaker adaptation with constrained maximum likelihood linear regression (CMLLR) [19].…”
Section: Training (mentioning)
confidence: 99%