An important part of understanding speech motor control consists of capturing the interaction between speech production and speech perception. This study tests a prediction of theoretical frameworks that have tried to account for these interactions: if speech production targets are specified in auditory terms, individuals with better auditory acuity should have more precise speech targets, evidenced by decreased within-phoneme variability and increased between-phoneme distance. A study was carried out consisting of perception and production tasks in counterbalanced order. Auditory acuity was assessed using an adaptive speech discrimination task, while production variability was determined using a pseudo-word reading task. Analyses of the production data quantified average within-phoneme variability as well as average between-phoneme contrasts. Results show that individuals not only vary in their production and perceptual abilities, but that better discriminators have more distinctive vowel production targets (that is, targets with less within-phoneme variability and greater between-phoneme distances), confirming the initial hypothesis. This association between speech production and perception did not depend on local phoneme density in vowel space. This study suggests that better auditory acuity leads to more precise speech production targets, which may be a consequence of auditory feedback affecting speech production over time.
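To make these two measures concrete, the following is a minimal sketch of how within-phoneme variability and between-phoneme distance could be computed in F1/F2 formant space. It is an illustration, not the authors' analysis code; the formant values, vowel labels, and function names are hypothetical placeholders.

```python
# Sketch: quantifying vowel production precision in F1/F2 space.
# All data below are hypothetical placeholders, not from the study.
import numpy as np

# Each row is one production (token); columns are (F1, F2) in Hz
formants = {
    "i": np.array([[300, 2300], [310, 2250], [295, 2320]]),
    "a": np.array([[700, 1200], [720, 1180], [690, 1250]]),
}

def within_variability(tokens):
    """Within-phoneme variability: mean Euclidean distance of tokens from their centroid."""
    centroid = tokens.mean(axis=0)
    return np.mean(np.linalg.norm(tokens - centroid, axis=1))

def between_distance(tokens_a, tokens_b):
    """Between-phoneme distance: Euclidean distance between category centroids."""
    return np.linalg.norm(tokens_a.mean(axis=0) - tokens_b.mean(axis=0))

for vowel, tokens in formants.items():
    print(f"/{vowel}/ within-phoneme variability: {within_variability(tokens):.1f} Hz")
print(f"/i/-/a/ between-phoneme distance: {between_distance(formants['i'], formants['a']):.1f} Hz")
```

Under this reading, a "more precise" target is one with a lower within-phoneme value and larger between-phoneme distances.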
When talking, speakers continuously monitor and use the auditory feedback of their own voice to control and inform speech production processes. When speakers are provided with auditory feedback that is perturbed in real time, most of them compensate by opposing the feedback perturbation; some responses, however, follow the perturbation. In the present study, we investigated whether the state of the speech production system at perturbation onset may determine which type of response (opposing or following) is made. The results suggest that whether a perturbation-related response is opposing or following depends on ongoing fluctuations of the production system: the system initially responds by doing the opposite of what it was doing. This effect, together with the nontrivial proportion of following responses, suggests that current production models are inadequate: they need to account for why responses to unexpected sensory feedback depend on the production system's state at the time of perturbation.
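A hedged sketch of how such responses could be scored: classify a trial as opposing or following by comparing the sign of the post-onset pitch change with the sign of the applied shift. The function name, data layout, and threshold-free decision rule are assumptions for illustration, not the study's actual pipeline.

```python
# Sketch: scoring a response to a pitch perturbation as opposing or following.
# Data layout and values are hypothetical, not from the study.
import numpy as np

def classify_response(pitch_trace, onset_idx, shift_cents):
    """pitch_trace: produced pitch (cents) sampled across the trial;
    onset_idx: sample index of perturbation onset;
    shift_cents: size and direction of the applied shift."""
    baseline = np.mean(pitch_trace[:onset_idx])
    response = np.mean(pitch_trace[onset_idx:]) - baseline
    return "opposing" if np.sign(response) == -np.sign(shift_cents) else "following"

# Example: pitch rises after a downward (-100 cent) shift -> "opposing"
trace = np.array([0.0, 0.5, -0.2, 3.0, 8.0, 12.0])
print(classify_response(trace, onset_idx=3, shift_cents=-100))
```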
Speaking is a complex motor skill that requires near-instantaneous integration of sensory and motor-related information. Current theory hypothesizes a complex interplay between motor and auditory processes during speech production, involving the online comparison of the speech output with an internally generated forward model. To examine the neural correlates of this intricate interplay between sensory and motor processes, the current study used altered auditory feedback (AAF) in combination with magnetoencephalography (MEG). Participants vocalized the vowel /e/ and heard auditory feedback that was temporarily pitch-shifted by only 25 cents, while neural activity was recorded with MEG. As a control condition, participants also heard recordings of the same auditory feedback that they had heard in the first half of the experiment, now without vocalizing. The participants were not aware of any perturbation of the auditory feedback. We found that auditory cortical areas responded more strongly to the pitch shifts during vocalization. In addition, auditory feedback perturbation resulted in spectral power increases in the θ and lower β bands, predominantly in sensorimotor areas. These results are in line with current models of speech production, suggesting that auditory cortical areas are involved in an active comparison between a forward model's prediction and the actual sensory input. Subsequently, these areas interact with motor areas to generate a motor response. Furthermore, the results suggest that θ and β power increases support auditory-motor interaction, motor error detection, and/or sensory prediction processing.
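For reference, cents are a logarithmic pitch unit: a shift of c cents scales frequency by 2^(c/1200), so 25 cents corresponds to roughly a 1.5% frequency change, small enough to go unnoticed. A short worked example (the fundamental frequency value is a hypothetical illustration):

```python
# Standard cents-to-frequency conversion (general formula, not study-specific code)
shift_cents = 25
ratio = 2 ** (shift_cents / 1200)  # ≈ 1.0145, i.e. about a 1.5% change
f0 = 220.0                         # hypothetical fundamental frequency in Hz
print(f"{f0} Hz shifted by {shift_cents} cents -> {f0 * ratio:.2f} Hz")  # ≈ 223.20 Hz
```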
Previous research on the effect of perturbed auditory feedback in speech production has focused on two types of responses. In the short term, speakers generate compensatory motor commands in response to unexpected perturbations. In the longer term, speakers adapt feedforward motor programmes in response to feedback perturbations, to avoid future errors. The current study investigated the relation between these two types of responses to altered auditory feedback. Specifically, it was hypothesised that the consistency of previous feedback perturbations would influence whether speakers adapt their feedforward motor programmes. In an altered auditory feedback paradigm, formant perturbations were applied either across all trials (the consistent condition) or only to some trials, while the others remained unperturbed (the inconsistent condition). The results showed that speakers' responses were affected by feedback consistency, with stronger speech changes in the consistent condition than in the inconsistent condition. Current models of speech-motor control can explain this consistency effect. However, the data also suggest that compensation and adaptation are distinct processes, a finding that is not in line with all current models.
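One generic way to see why consistency should matter is a trial-by-trial error-correction rule in which the feedforward command is nudged against the perceived error on each trial: when only half of the trials are perturbed, the average error, and hence the accumulated adaptation, is smaller. The toy simulation below illustrates this logic under arbitrary parameter values; it is not a model proposed by the study.

```python
# Toy simulation of trial-by-trial adaptation (illustrative assumptions only)
import random

def simulate(perturb_prob, n_trials=100, learning_rate=0.1, shift=100.0):
    command = 0.0  # feedforward offset (arbitrary units), starts unadapted
    for _ in range(n_trials):
        perturb = shift if random.random() < perturb_prob else 0.0
        error = command + perturb         # mismatch between heard and intended output
        command -= learning_rate * error  # nudge the command against the error
    return command

random.seed(0)
print("consistent:  ", round(simulate(1.0), 1))  # perturbed on every trial -> near -100
print("inconsistent:", round(simulate(0.5), 1))  # perturbed on half the trials -> near -50
```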
Speakers monitor auditory feedback during speech production in order to correct for speech errors. The comparator model proposes that this process is supported by comparing sensory feedback to internal predictions of the sensory consequences of articulation. Additionally, this comparison process is proposed to support the sense of agency over vocal output. The current study tests this hypothesis by asking whether mismatching auditory feedback leads to a decrease in the sense of agency, as measured by speakers' responses to pitch-shifted feedback. Participants vocalized while auditory feedback was unexpectedly and briefly pitch-shifted. In addition, in one block the entire vocalization's pitch was baseline-shifted ('alien voice'), while in the other block it was not ('normal voice'). Participants compensated for the pitch shifts even in the alien voice condition, suggesting that agency is flexible. This is problematic for the classic comparator model, in which mismatching feedback would lead to a loss of agency. Alternative models are discussed in light of these findings, including an adapted comparator model and the inferential account, which suggests that agency is inferred from the joint contribution of several multisensory sources of evidence. Together, these findings suggest that internal representations of one's own voice are more flexible than often assumed.
One of the most daunting tasks of a listener is to map a continuous auditory stream onto known speech sound categories and lexical items. A major issue with this mapping problem is the variability in the acoustic realizations of sound categories, both within and across speakers. Past research has suggested that listeners may use visual information (e.g., lipreading) to calibrate these speech categories to the current speaker. Previous studies have focused on audiovisual recalibration of consonant categories. The present study explores whether vowel categorization, which is known to show less sharply defined category boundaries, also benefits from visual cues. Participants were exposed to videos of a speaker pronouncing one of two vowels, paired with audio that was ambiguous between the two vowels. After exposure, participants were found to have recalibrated their vowel categories. In addition, individual variability in audiovisual recalibration is discussed. It is suggested that listeners' category sharpness may be related to the weight they assign to visual information in audiovisual speech perception. Specifically, listeners with less sharp categories assign more weight to visual information during audiovisual speech recognition.
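As a sketch of how category sharpness and recalibration could be quantified, one common approach (assumed here; the study's exact analysis may differ) is to fit a logistic psychometric function to categorization responses along the auditory continuum: the inflection point estimates the category boundary, and the slope estimates sharpness. A boundary shift after audiovisual exposure would index recalibration.

```python
# Sketch: fitting a logistic psychometric function (hypothetical data)
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

steps = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)           # continuum steps
prop_b = np.array([0.02, 0.05, 0.20, 0.55, 0.85, 0.95, 0.98])  # proportion "vowel B" responses

(boundary, slope), _ = curve_fit(logistic, steps, prop_b, p0=[4.0, 1.0])
print(f"boundary: {boundary:.2f}, sharpness (slope): {slope:.2f}")
```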
The role of auditory feedback in vocal production has mainly been investigated by altering auditory feedback (AAF) in real time. In response, speakers compensate by shifting their speech output in the direction opposite to the manipulation. Current theory suggests this is caused by a mismatch between expected and observed feedback. A methodological issue is the difficulty of fully isolating the speaker's hearing so that only the AAF reaches their ears. As a result, participants may be presented with two simultaneous signals. If so, an alternative explanation is that responses to AAF depend on the contrast between the manipulated and the non-manipulated feedback. This hypothesis was tested by varying the passive sound attenuation (PSA). Participants vocalized while auditory feedback was unexpectedly pitch-shifted. The feedback was played through three pairs of headphones with varying amounts of PSA. The participants' responses were not affected by the different levels of PSA. This suggests that, across all three headphones, the PSA is either good enough to make the manipulated feedback dominant, or the differences in PSA are too small to affect the contribution of non-manipulated feedback. Overall, the results underline that non-manipulated auditory feedback could affect responses to AAF.
Models of speech production explain event-related suppression of the auditory cortical response as reflecting a comparison between auditory predictions and feedback. The present MEG study was designed to test two predictions from this framework: (1) whether the reduced auditory response varies as a function of the mismatch between prediction and feedback, and (2) whether individual variation in this response is predictive of speech-motor adaptation. Participants alternated between online imitation and listening tasks. In the imitation task, participants began each trial producing the same vowel (/e/) and subsequently listened to and imitated auditorily presented vowels varying in acoustic distance from /e/. Results replicated suppression, with a smaller M100 during speaking than listening. Although we did not find unequivocal support for the first prediction, participants with less M100 suppression were better at the imitation task. These results are consistent with an enhanced M100 serving as an error signal that drives subsequent speech-motor adaptation.