Interspeech 2011
DOI: 10.21437/interspeech.2011-609

Improved tonal language speech recognition by integrating spectro-temporal evidence and pitch information with properly chosen tonal acoustic units

Cited by 7 publications
(2 citation statements)
References 0 publications
“…Each feature dimension was Z-normalized per speaker. One additional experiment was performed with Model 1, in which its input feature vector was augmented by a fundamental frequency measurement (F0), because F0 has been shown to reduce ASR error rates for tonal languages [33,34]. F0 was extracted from the same 25ms windowed frame, converted from Hertz to Mel scale, then appended to the 40-dimensional log Mel features.…”
Section: Experimental Methods (mentioning)
confidence: 99%
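The feature augmentation described in this statement can be sketched as follows. This is a minimal illustration, not the citing authors' code: the HTK-style Mel formula, the function names, and the treatment of unvoiced frames (F0 of 0 Hz maps to 0 Mel) are all assumptions, and the statement does not say whether the appended F0 column was itself Z-normalized.

```python
import numpy as np

def hz_to_mel(f_hz):
    """HTK-style Hz-to-Mel mapping (an assumed variant; other formulas exist)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def append_f0(log_mel, f0_hz):
    """Append Mel-scaled F0 as one extra column: (T, 40) -> (T, 41)."""
    f0_mel = hz_to_mel(f0_hz).reshape(-1, 1)
    return np.hstack([log_mel, f0_mel])

def znorm_per_speaker(features, speaker_ids, eps=1e-8):
    """Z-normalize each feature dimension using per-speaker mean and std."""
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    out = np.empty_like(features)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        mean = features[mask].mean(axis=0)
        std = features[mask].std(axis=0)
        out[mask] = (features[mask] - mean) / (std + eps)
    return out
```

Here `log_mel` stands for a hypothetical array of 40-dimensional log Mel frames and `f0_hz` for one F0 estimate per frame, both computed elsewhere from the same 25 ms windows.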
“…Some papers have suggested extracting alternative features from the input signal. In [14], Li et al convert the input signal to a spectrogram and convolve it with a set of Gabor filters to yield a set of feature maps. Frame-level tone labels are obtained using forced alignment, and an MLP is trained to predict the tone label for each frame using the feature maps.…”
Section: Existing Approaches (mentioning)
confidence: 99%
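The Gabor filtering step mentioned in this statement can be illustrated with a small sketch. It assumes a real-valued 2-D Gabor kernel (Gaussian envelope times a cosine carrier) and made-up filter parameters; the actual filter bank, the forced alignment, and the MLP from the cited paper are not reproduced here, and all names below are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gabor_kernel(size, omega_t, omega_f, sigma):
    """Real 2-D Gabor kernel over (frequency, time); size must be odd."""
    half = size // 2
    f, t = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = np.exp(-(t ** 2 + f ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(omega_t * t + omega_f * f)
    kernel = envelope * carrier
    # Remove the DC component so flat spectrogram regions give ~0 response.
    return kernel - kernel.mean()

def convolve_same(spec, kernel):
    """2-D convolution with edge padding; output has the shape of spec."""
    kh, kw = kernel.shape
    padded = np.pad(spec, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    windows = sliding_window_view(padded, (kh, kw))  # shape (F, T, kh, kw)
    # Flipping the kernel turns the sliding correlation into a convolution.
    return np.einsum("ftij,ij->ft", windows, kernel[::-1, ::-1])

def gabor_feature_maps(spec, params, size=9, sigma=2.0):
    """Stack one feature map per (omega_t, omega_f) pair: (n_filters, F, T)."""
    return np.stack([
        convolve_same(spec, gabor_kernel(size, wt, wf, sigma))
        for wt, wf in params
    ])
```

Each column of the resulting maps corresponds to one frame, so the per-frame values could be fed to a frame-level tone classifier such as the MLP the statement describes.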