1987
DOI: 10.1121/1.2024783

Speaker-independent automatic vowel recognition based on overall spectral shape versus formants

Abstract: Automatic recognition experiments were performed to compare overall spectral shape versus formants as speaker-independent acoustic parameters for vowel identity. Stimuli consisted of four repetitions of 11 vowels spoken by 17 female speakers and 12 male speakers (29 × 11 × 4 = 1276 total stimuli). Formants were computed automatically by peak picking of 12th-order LP model spectra. Spectral shape was represented using three methods: (1) by a cosine basis vector expansion of the power spectrum; (2) as the output of …
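The cosine basis vector expansion mentioned in the abstract can be sketched as follows. This is a minimal illustration of the general technique (projecting the log power spectrum onto low-order cosine basis vectors, in the spirit of DCTC features), not the authors' implementation; the function name and parameters are hypothetical.

```python
import numpy as np

def cosine_shape_coeffs(power_spectrum, n_coeffs=10):
    """Represent overall spectral shape by a cosine basis vector
    expansion of the log power spectrum (a DCTC-style sketch;
    illustrative, not the paper's exact procedure)."""
    log_spec = np.log(np.maximum(power_spectrum, 1e-12))  # avoid log(0)
    n = len(log_spec)
    k = np.arange(n_coeffs)[:, None]      # coefficient indices 0..n_coeffs-1
    t = (np.arange(n) + 0.5) / n          # normalized frequency bin centers
    basis = np.cos(np.pi * k * t)         # cosine basis, shape (n_coeffs, n)
    return basis @ log_spec / n           # expansion coefficients

# Two spectra with the same broad shape but different fine detail
# yield similar low-order coefficients, which is the point of the
# overall-spectral-shape representation.
freqs = np.linspace(0, 1, 256)
smooth = np.exp(-(freqs - 0.3) ** 2 / 0.02)
rippled = smooth * (1 + 0.05 * np.sin(40 * np.pi * freqs))
c1 = cosine_shape_coeffs(smooth)
c2 = cosine_shape_coeffs(rippled)
```

Because the fine spectral ripple is nearly orthogonal to the low-order cosine basis vectors, the two coefficient sets come out close, whereas a formant-based representation would instead depend on locating individual spectral peaks.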

Cited by 4 publications (1 citation statement) · References 0 publications
“…Caution must be taken in comparing the filter-bank results with the DCTC and formant results, since the stimuli used for the filter-bank case consisted of sustained steady-state vowels only, rather than vowels from a variety of consonant contexts. However, in previous work for which the stimuli were identical across conditions (although fewer in number), we found that automatic recognition results for vowels were very similar for either filter-bank data or DCTC data [7].…”

Section: Acoustic-Phonetic Transformation
Confidence: 48%