2001
DOI: 10.1121/1.1384908

On the effectiveness of whole spectral shape for vowel perception

Abstract: The formant hypothesis of vowel perception, where the lowest two or three formant frequencies are essential cues for vowel quality perception, is widely accepted. There has, however, been some controversy suggesting that formant frequencies are not sufficient and that the whole spectral shape is necessary for perception. Three psychophysical experiments were performed to study this question. In the first experiment, the first or second formant peak of stimuli was suppressed as much as possible while still main…

citations
Cited by 47 publications
(38 citation statements)
references
References 18 publications
“…Therefore, the receptive area for each Japanese vowel in the F1-F2 space is larger than in other languages, with more vowels such as French and Norwegian (Crothers, 1978). This circumstance implies that there are larger ambiguous receptive areas across adjacent vowels (Ito et al, 2001; Ueda & Watanabe, 1987; Fig. 1).…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Only F0, F1 and F2 were variable, whereas all of the other parameters were constant (F3 = 2500 Hz, F4 = 3500 Hz, F5 = 4500 Hz, duration 400 ms, including a rise/fall of 10/200 ms, binaural presentation with the intensity of 80 dB SPL; Table 1). The F1 and F2 of every tone were chosen within a frequency range of vowels that a healthy human can generate (Ito, Tsuchida, & Yano, 2001; Kewley-Port & Watson, 1994; Peterson & Barney, 1952). The relative intensities of the formant frequencies with respect to the fundamental frequencies were defined by the default parameters of the HLsyn: the area of the glottis 4; the area at the lips 100; the area of the tongue blade 100; the area of the velopharyngeal 0; and active vocal-tract expansion 0.…”
Section: Tones (citation type: mentioning)
confidence: 99%
“…Despite the widespread use of the formant pattern as an explanatory concept in speech perception, and the numerous virtues of formant representations, the idea is not without some troublesome problems, which have been noted by a number of investigators. Briefly, these problems include the following: (1) the determinacy problem, as Bladon (1982) has called it, which is the commonplace idea that tracking formants in natural speech is a difficult and, as yet, unresolved problem; (2) the straightforward observation that perceptual confusions made by human listeners nearly always involve speech sounds that are phonetically quite similar, a pattern that is difficult to reconcile with an underlying formant tracking process that is susceptible to gross errors that occur when formants either split or merge (Klatt, 1982; see also Ito, Tsuchida, & Yano, 2001); (3) evidence showing that spectral details other than formant frequencies can affect phonetic quality (e.g., Bladon, 1982; Chistovich & Lublinskaja, 1979; Hillenbrand & Nearey, 1999). Partially in response to these problems, a number of investigators have argued that phonetic recognition is mediated by mental computations of similarities and differences in the gross shape of the spectrum rather than by formant frequencies (e.g., Bladon & Lindblom, 1981; Hillenbrand & Houde, 2003; Zahorian & Jagharghi, 1993).…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…While the extent to which absolute spectral tilt plays a role in phonetic perception is generally unresolved (see, e.g., Hillenbrand et al, 1995; Bladon and Lindblom, 1981; Zahorian and Jagharghi, 1993; Ito et al, 2001), changes in tilt across time clearly influence perception of speech by normal-hearing listeners (Alexander and Kluender, 2008), and especially by hearing-impaired listeners (Alexander and Kluender, 2009) for whom spectral peaks are often obscured by abnormal cochlear processing. Furthermore, it is known that rapid changes in spectral tilt can substantially impair sentence identification (Van Dijkhuizen et al, 1987; Haggard et al, 1987).…”
Section: Introduction (citation type: mentioning)
confidence: 99%