Temporal properties associated with the speech signal are potentially important for understanding spoken language. Five hours of spontaneous American English dialogue material (from the SWITCHBOARD corpus) were hand-labeled and segmented at the phonetic-segment level; a forty-five-minute subset was also manually annotated (at the syllabic level) with respect to stress accent.
There is a systematic relationship between stress accent and vocalic identity in spontaneous English discourse (the Switchboard corpus of telephone dialogues). Low vowels are much more likely to be fully accented than their high vocalic counterparts. Conversely, high vowels are far more likely to lack stress accent than low or mid vocalic segments. Such patterns imply that stress accent and vowel height are bound together at some level of lexical representation. Vocalic duration appears to be the primary acoustic cue associated with stress accent, and the association between vowel height and accent level is most clearly observed in this dimension, particularly for diphthongs and the low, tense monophthongs. Together, the data suggest that vocalic duration plays an exceedingly important role in understanding spoken language.
Phonemic models of spoken language are incapable of accommodating the patterns of pronunciation variation observed in spontaneous speech (as exemplified by a corpus of American English telephone dialogues, a.k.a. SWITCHBOARD). Variation in pronunciation with respect to segmental identity and duration can be accounted for in terms of a juncture-accent model, in which position of the segment within the syllable (i.e., onset, nucleus, coda), in tandem with knowledge of the associated stress-accent pattern, is used to interpret the inherently ambiguous phonetic information contained in the acoustic signal. Many properties of pronunciation variation can be accounted for in terms of such a model, including: (1) the prevalence of coda deletion, (2) the mutability of vocalic identity and (3) the relative stability of syllable onsets. The melding of phonetic and prosodic features within the syllable provides for efficient and reliable linguistic information coding.