Neuroimaging has revealed a core network of cortical regions that contribute to speech production, but the functional organization of this network remains poorly understood. Purpose: We describe efforts to identify reliable boundaries around functionally homogenous regions within the cortical speech motor control network in order to improve the sensitivity of functional magnetic resonance imaging (fMRI) analyses of speech production and thus improve our understanding of the functional organization of speech production in the brain. Method: We used a bottom-up, data-driven approach by pooling data from 12 previously conducted fMRI studies of speech production involving the production of monosyllabic and bisyllabic words and pseudowords that ranged from single vowels and consonant–vowel pairs to short sentences (163 scanning sessions, 136 unique participants, 39 different speech conditions). After preprocessing all data through the same pipeline and registering individual contrast maps to a common surface space, hierarchical clustering was applied to contrast maps randomly sampled from the pooled data set in order to identify consistent functional boundaries across subjects and tasks. Boundary completion was achieved by applying adaptive smoothing and watershed segmentation to the thresholded population-level boundary map. Hierarchical clustering was applied to the mean within–functional region of interest (fROI) response to identify networks of fROIs that respond similarly during speech. Results: We identified highly reliable functional boundaries across the cortical areas involved in speech production. Boundary completion resulted in 117 fROIs in the left hemisphere and 109 in the right hemisphere. Clustering of the mean within-fROI response revealed a core sensorimotor network flanked by a speech motor planning network. 
The majority of the left inferior frontal gyrus clustered with the visual word form area and brain regions (e.g., anterior insula, dorsal anterior cingulate) associated with detecting salient sensory inputs and choosing the appropriate action. Conclusion: The fROIs provide insight into the organization of the speech production network and a valuable tool for studying speech production in the brain by improving within-group and between-groups comparisons of speech-related brain activity. Supplemental Material: https://doi.org/10.23641/asha.9402674
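As a rough illustration of the final clustering step described above, the following Python sketch groups fROIs by the similarity of their mean responses across speech conditions. It is not the authors' pipeline: the response values are invented, and the Euclidean distance, average linkage, and the function name `average_linkage` are assumptions made for illustration, since the abstract does not state which distance or linkage was used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length response profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with average linkage (illustrative only).

    profiles: one list of mean responses per fROI (one value per condition).
    Returns a list of clusters, each a sorted list of fROI indices.
    """
    # Start with every fROI in its own cluster.
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical mean responses for 5 fROIs across 3 speech conditions.
profiles = [
    [1.0, 0.9, 1.1],
    [1.1, 1.0, 1.0],
    [0.1, 0.2, 0.0],
    [0.0, 0.1, 0.1],
    [1.0, 1.0, 0.9],
]
networks = sorted(average_linkage(profiles, n_clusters=2))
print(networks)
```

In this toy data set, fROIs with similar response profiles end up in the same group, which is the sense in which clustering can reveal "networks of fROIs that respond similarly".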
Tongue surface measurements from midsagittal ultrasound scans are effectively arcs with deviations representing tongue shape, but smoothing-spline analyses of variance (SSANOVAs) assume variance around a horizontal line. Therefore, calculating SSANOVA average curves of tongue traces in Cartesian coordinates [Davidson, J. Acoust. Soc. Am. 120(1), 407-415 (2006)] creates errors that are compounded at the tongue tip and root, where average tongue shape deviates most from a horizontal line. This paper introduces a method for transforming data into polar coordinates similar to the technique by Mielke [J. Acoust. Soc. Am. 137(5), 2858-2869 (2015)], but using the virtual origin of a radial ultrasound transducer as the polar origin, allowing data conversion in a manner that is robust against between-subject and between-session variability.
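The coordinate change described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `to_polar` and `to_cartesian`, the origin position, and the synthetic trace are all hypothetical. In practice the virtual origin would be derived from the transducer geometry.

```python
import math


def to_polar(points, origin):
    """Convert (x, y) points to (theta, r) measured from `origin`."""
    ox, oy = origin
    polar = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        polar.append((math.atan2(dy, dx), math.hypot(dx, dy)))
    return polar


def to_cartesian(polar, origin):
    """Convert (theta, r) pairs back to (x, y) around `origin`."""
    ox, oy = origin
    return [(ox + r * math.cos(t), oy + r * math.sin(t)) for t, r in polar]


# Hypothetical virtual origin of the transducer (mm), assumed for this example.
origin = (0.0, -20.0)

# Synthetic "tongue trace": a perfect arc of radius 60 mm around the origin.
trace = [
    (origin[0] + 60.0 * math.cos(math.radians(a)),
     origin[1] + 60.0 * math.sin(math.radians(a)))
    for a in range(30, 151, 30)
]

polar = to_polar(trace, origin)
radii = [r for _, r in polar]
restored = to_cartesian(polar, origin)
```

For an arc centred on the origin, the radius is constant across angles, so in polar space the trace becomes a flat line. That is the condition under which the assumption of variance around a horizontal line is a reasonable one.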
This paper investigates the articulation of approximant /ɹ/ in New Zealand English (NZE), and tests whether the patterns documented for rhotic varieties of English hold in a non-rhotic dialect. Midsagittal ultrasound data for 62 speakers producing 13 tokens of /ɹ/ in various phonetic environments were categorized according to the taxonomy by Delattre & Freeman (1968), and semi-automatically traced and quantified using the AAA software (Articulate Instruments Ltd. 2012) and a Modified Curvature Index (MCI; Dawson, Tiede & Whalen 2016). Twenty-five NZE speakers produced tip-down /ɹ/ exclusively, 12 tip-up /ɹ/ exclusively, and 25 produced both, partially depending on context. Those speakers who produced both variants used the most tip-down /ɹ/ in front vowel contexts, the most tip-up /ɹ/ in back vowel contexts, and varying rates in low central vowel contexts. The NZE speakers produced tip-up /ɹ/ most often in word-initial position, followed by intervocalic, then coronal, and least often in velar contexts. The results indicate that the allophonic variation patterns of /ɹ/ in NZE are similar to those of American English (Mielke, Baker & Archangeli 2010, 2016). We show that MCI values can be used to facilitate /ɹ/ gesture classification; linear mixed-effects models fit on the MCI values of manually categorized tongue contours show significant differences between all but two of Delattre & Freeman's (1968) tongue types. Overall, the results support theories of modular speech motor control with articulation strategies evolving from local rather than global optimization processes, and a mechanical model of rhotic variation (see Stavness et al. 2012).
This mini review is aimed at a clinician-scientist seeking to understand the role of oscillations in neural processing and their functional relevance in speech and music perception. We present an overview of neural oscillations, methods used to study them, and their functional relevance with respect to music processing, aging, hearing loss, and disorders affecting speech and language. We first review the oscillatory frequency bands and their associations with speech and music processing. Next we describe commonly used metrics for quantifying neural oscillations, briefly touching upon the still-debated mechanisms underpinning oscillatory alignment. Following this, we highlight key findings from research on neural oscillations in speech and music perception, as well as contributions of this work to our understanding of disordered perception in clinical populations. Finally, we conclude with a look toward the future of oscillatory research in speech and music perception, including promising methods and potential avenues for future work. We note that the intention of this mini review is not to systematically review all literature on cortical tracking of speech and music. Rather, we seek to provide the clinician-scientist with foundational information that can be used to evaluate and design research studies targeting the functional role of oscillations in speech and music processing in typical and clinical populations.
This paper presents the findings of an ultrasound study of 10 New Zealand English and 10 Tongan-speaking trombone players, to determine whether there is an influence of native language speech production on trombone performance. Trombone players’ midsagittal tongue shapes were recorded while reading wordlists and during sustained note productions, and tongue surface contours traced. After normalizing to account for differences in vocal tract shape and ultrasound transducer orientation, we used generalized additive mixed models (GAMMs) to estimate average tongue surface shapes used by the players from the two language groups when producing notes at different pitches and intensities, and during the production of the monophthongs in their native languages. The average midsagittal tongue contours predicted by our models show a statistically robust difference at the back of the tongue distinguishing the two groups, where the New Zealand English players display an overall more retracted tongue position; however, tongue shape during playing does not directly map onto vowel tongue shapes as prescribed by the pedagogical literature. While the New Zealand English-speaking participants employed a playing tongue shape approximating schwa and the vowel used in the word ‘lot,’ the Tongan participants used a tongue shape loosely patterning with the back vowels /o/ and /u/. We argue that these findings represent evidence for native language influence on brass instrument performance; however, this influence seems to be secondary to more basic constraints of brass playing related to airflow requirements and acoustical considerations, with the vocal tract configurations observed across both groups satisfying these conditions in different ways. 
Our findings furthermore provide evidence for the functional independence of various sections of the tongue and indicate that speech production, itself an acquired motor skill, can influence another skilled behavior via motor memory of vocal tract gestures forming the basis of local optimization processes to arrive at a suitable tongue shape for sustained note production.