In natural conversation, turns are handed off quickly, with the mean downtime commonly ranging from 7 to 423 ms. To achieve this, speakers plan their upcoming speech as their partner’s turn unfolds, holding the audible utterance in abeyance until socially appropriate. The role played by prediction is debated, with some researchers claiming that speakers predict upcoming speech opportunities, and others claiming that speakers wait for detection of turn-final cues. The dynamics of articulatory triggering may speak to this debate. It is often assumed that the prepared utterance is held in a response buffer and then initiated all at once. This assumption is consistent with standard phonetic models in which articulatory actions must follow tightly prescribed patterns of coordination. This assumption has recently been challenged by single-word production experiments in which participants partly positioned their articulators to anticipate upcoming utterances, long before starting the acoustic response. The present study considered whether similar anticipatory postures arise when speakers in conversation await their next opportunity to speak. We analyzed a pre-existing audiovisual database of dyads engaging in unstructured conversation. Video motion tracking was used to determine speakers’ lip areas over time. When utterance-initial syllables began with labial consonants or included rounded vowels, speakers produced distinctly smaller lip areas (compared to other utterances), prior to audible speech. This effect was moderated by the number of words in the upcoming utterance; postures arose up to 3,000 ms before acoustic onset for short utterances of 1–3 words. We discuss the implications for models of conversation and phonetic control.
The timing and coordination of articulatory movements are essential to producing speech. A considerable body of literature explores the cognitive and motor control mechanisms involved. This review provides
In the form preparation task, participants verbally produce words in small sets, which either overlap on an early phonological fragment or contrast on that fragment. A canonical account of word-form encoding assumes a sequential phonological encoding phase necessarily preceding subsequent retrieval of a discrete phonetic motor plan. This account assumes that acoustic onset and speech onset are equivalent, and that speech onset never precedes complete processing of the stimulus. In two form preparation experiments, we examined the influence of anticipatory processes on preacoustic lip articulation. We used motion-tracked digital video to measure continuous changes in vertical lip aperture. In sets with initial segment overlap, participants configured their lips to anticipate upcoming aerodynamic demands, before stimulus presentation. Anticipatory posturing arose even when the initial segment was only 75% certain. These findings appear inconsistent with extant speech models that assume ballistic execution of a fully encoded, certainly known response.