1993
DOI: 10.1121/1.407520

Spectral-shape features versus formants as acoustic correlates for vowels

Abstract: The first three formants, i.e., the first three spectral prominences of the short-time magnitude spectra, have been the most commonly used acoustic cues for vowels ever since the work of Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)]. However, spectral shape features, which encode the global smoothed spectrum, provide a more complete spectral description, and therefore might be even better acoustic correlates for vowels. In this study automatic vowel classification experiments were used to compar…

Cited by 138 publications (76 citation statements)
References 39 publications
“…Briefly, these problems include the following: (1) the determinacy problem, as Bladon (1982) has called it, which is the commonplace idea that tracking formants in natural speech is a difficult and, as yet, unresolved problem; (2) the straightforward observation that perceptual confusions made by human listeners nearly always involve speech sounds that are phonetically quite similar, a pattern that is difficult to reconcile with an underlying formant tracking process that is susceptible to gross errors that occur when formants either split or merge (Klatt, 1982; see also Ito, Tsuchida, & Yano, 2001); (3) evidence showing that spectral details other than formant frequencies can affect phonetic quality (e.g., Bladon, 1982; Chistovich & Lublinskaja, 1979; Hillenbrand & Nearey, 1999). Partially in response to these problems, a number of investigators have argued that phonetic recognition is mediated by mental computations of similarities and differences in the gross shape of the spectrum rather than by formant frequencies (e.g., Bladon & Lindblom, 1981; Hillenbrand & Houde, 2003; Zahorian & Jagharghi, 1993). The issues discussed above revolve around the question of formants versus gross spectral shape in the perception of phonetic quality rather than talker sex, but the same question can be asked about either issue.…”
Section: Discussion (mentioning)
confidence: 99%
“…Again, while the role of spectral tilt in phonetic perception is unresolved (Hillenbrand et al, 1995; Bladon and Lindblom, 1981; Zahorian and Jagharghi, 1993; Ito et al, 2001), changes in tilt between speech segments have been shown to greatly influence speech perception by normal-hearing listeners (Alexander and Kluender, 2008), and especially hearing-impaired listeners (Alexander and Kluender, 2009) for whom spectral peaks are often reduced by abnormal cochlear filtering. The potential role that global spectral properties plays in speech perception is also demonstrated here with F2-matched precursors.…”
Section: A Summary (mentioning)
confidence: 99%
“…While the extent to which absolute spectral tilt plays a role in phonetic perception is generally unresolved (see, e.g., Hillenbrand et al, 1995; Bladon and Lindblom, 1981; Zahorian and Jagharghi, 1993; Ito et al, 2001), changes in tilt across time clearly influence perception of speech by normal-hearing listeners (Alexander and Kluender, 2008), and especially by hearing-impaired listeners (Alexander and Kluender, 2009) for whom spectral peaks are often obscured by abnormal cochlear processing. Furthermore, it is known that rapid changes in spectral tilt can substantially impair sentence identification (Van Dijkhuizen et al, 1987; Haggard et al, 1987).…”
Section: Introduction (mentioning)
confidence: 99%
“…Having established the types of variation in formant trajectories that can be expected in cross-dialectal data in terms of TL, directionality, and curvature, a refinement of the current measures will be undertaken in order to address the changes in the direction of formant movement. In particular, parametrization procedures can be used (e.g., Harrington, 2006; Harrington et al, 2008; Hillenbrand et al, 2001; Morrison, 2009; Zahorian and Jagharghi, 1993) in order to model the various trajectory shapes.…”
Section: A Characterizing the Variation In Formant Dynamics (mentioning)
confidence: 99%