1999
DOI: 10.1121/1.424675

Missing-data model of vowel identification

Abstract: Vowel identity correlates well with the shape of the transfer function of the vocal tract, in particular the position of the first two or three formant peaks. However, in voiced speech the transfer function is sampled at multiples of the fundamental frequency (F0), and the short-term spectrum contains peaks at those frequencies, rather than at formants. It is not clear how the auditory system estimates the original spectral envelope from the vowel waveform. Cochlear excitation patterns, for example, resolve ha…

Cited by 41 publications (13 citation statements)
References 53 publications

“…As to the possibility of signal distortion, Miller et al (2010) proposed the explanation that when the synthesized f0 is considerably higher than the original f0, the formant peaks of the vowels are reduced due to misalignment of harmonics of f0 with vocal-tract resonances (de Cheveigné & Kawahara, 1999; Diehl, Lindblom, Hoemeke, & Fahey, 1996). This distortion could lower speech intelligibility, particularly in background noise.…”
Section: Discussion (mentioning)
confidence: 99%
“…Human listeners are little affected by large changes in F0 despite the sparse sampling at higher F0's (e.g., Assmann and Katz, 2000). One promising approach is the "missing information" matching of spectral templates, in which only frequencies at which a harmonic is present contribute to the output (de Cheveigné and Kawahara, 1999). In this way, entire formants can be unrepresented by harmonics, but their absence does not affect identification rates.…”
Section: Discussion (mentioning)
confidence: 99%
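
The "missing information" matching described in the statement above can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the cited authors' implementation: the toy resonance-shaped envelope, the formant values in `TEMPLATES`, the 100 Hz bandwidth, and the function names are all hypothetical. The only idea taken from the text is that the template distance is evaluated solely at frequencies where a harmonic of F0 is present.

```python
import math

# Hypothetical formant frequencies (Hz) per vowel template; illustrative values,
# not taken from the cited paper.
TEMPLATES = {
    "a": (730.0, 1090.0, 2440.0),
    "i": (270.0, 2290.0, 3010.0),
    "u": (300.0, 870.0, 2240.0),
}

def envelope_db(freq_hz, formants, bandwidth_hz=100.0):
    """Toy spectral envelope in dB: one resonance-like peak per formant."""
    power = sum(1.0 / (1.0 + ((freq_hz - f) / bandwidth_hz) ** 2) for f in formants)
    return 10.0 * math.log10(power + 1e-6)

def harmonic_samples(formants, f0_hz, max_hz=4000.0):
    """Sample the envelope only at multiples of F0, as in voiced speech."""
    n_harmonics = int(max_hz // f0_hz)
    return [(k * f0_hz, envelope_db(k * f0_hz, formants))
            for k in range(1, n_harmonics + 1)]

def missing_data_distance(samples, formants):
    """Mean squared dB error, computed only where a harmonic is present."""
    errors = [(level - envelope_db(freq, formants)) ** 2 for freq, level in samples]
    return sum(errors) / len(errors)

def identify(samples):
    """Return the template label with the smallest missing-data distance."""
    return min(TEMPLATES, key=lambda v: missing_data_distance(samples, TEMPLATES[v]))
```

For example, `identify(harmonic_samples(TEMPLATES["i"], 400.0))` compares templates at only ten harmonic frequencies, none of which needs to coincide with a formant peak. Frequencies between harmonics are treated as missing rather than as zero energy, so they add no penalty to any template.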
“…Thus, the optimal resolution for vowel quality perception may be found in the low-spectral-, high-temporal-resolution channel rather than the high-spectral-, low-temporal-resolution channel (but cf. Cheveigné and Kawahara, 1999; Hillenbrand and Houde, 2003). A number of behavioral methods could be used to experimentally test the temporal resolution of vowel quality perception. Past research has investigated vowel perception by measuring the temporal limits of forward and backward masking.…”
Section: Introduction (mentioning)
confidence: 99%