The study of the static and dynamic aspects of speech production can profit from technologies such as electromagnetic midsagittal articulography (EMA) and real-time magnetic resonance imaging (RT-MRI). These can improve our knowledge of which articulators and gestures are involved in producing specific sounds and foster improved speech production models, which are paramount for advancing, e.g., articulatory speech synthesis. Previous work by the authors has shown that critical articulator identification could be performed from RT-MRI data of the vocal tract, with encouraging results, by extending the applicability of an unsupervised statistical identification method previously proposed for EMA data. Nevertheless, two limitations were identified. First, the lower temporal resolution of the RT-MRI corpus considered (14 Hz), compared with EMA, potentially affects the ability to select the most suitable representative configuration for each phone, which is paramount for strongly dynamic phones such as nasal vowels. Second, the corpus lacked a richer set of contexts, which is relevant for observing coarticulation effects. This article addresses these limitations by exploring critical articulator identification from a faster RT-MRI corpus (50 Hz) for European Portuguese, providing a richer set of contexts, and by testing how fusing the articulatory data of two speakers might influence critical articulator determination.
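To make the difference in temporal resolution concrete, the short sketch below converts the two frame rates mentioned above into frame intervals and counts how many frames fall within a single phone. The 100 ms phone duration and the function names are illustrative assumptions, not values or code from the study.

```python
def frame_interval_ms(frame_rate_hz: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / frame_rate_hz


def frames_per_phone(frame_rate_hz: float, phone_duration_ms: float) -> float:
    """Average number of frames acquired during a phone of the given duration."""
    return phone_duration_ms / frame_interval_ms(frame_rate_hz)


# Assumption: a 100 ms phone, chosen only for illustration.
PHONE_DURATION_MS = 100.0

for rate in (14.0, 50.0):  # frame rates of the two RT-MRI corpora
    print(
        f"{rate:.0f} Hz: {frame_interval_ms(rate):.1f} ms per frame, "
        f"{frames_per_phone(rate, PHONE_DURATION_MS):.1f} frames per phone"
    )
```

Under this assumed duration, the 14 Hz corpus yields between one and two frames per phone, against five at 50 Hz, which is why the choice of a representative frame is more constrained at the lower rate.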
This paper investigates similarities between lexical consonant clusters and CVC sequences differing in the presence or absence of a lexical vowel in speech perception and production in two Portuguese varieties. The frequent high vowel deletion in the European variety (EP) and the realization of intervening vocalic elements between lexical clusters in Brazilian Portuguese (BP) may minimize the contrast between lexical clusters and CVC sequences in the two Portuguese varieties. To test this hypothesis, we present a perception experiment with 72 participants and a physiological analysis of 3-dimensional movement data from 5 EP and 4 BP speakers. The perceptual results confirmed a gradual confusion of lexical clusters and CVC sequences in EP, which corresponded roughly to the gradient consonantal overlap found in production.
The characterisation of nasal vowels is not only a question of studying velar aperture. Recent work shows that oropharyngeal articulatory adjustments enhance the acoustics of nasal coupling or, at least, magnify differences between oral/nasal vowel congeners. Despite preliminary studies on the oral configurations of nasal vowels for European Portuguese (EP), a quantitative analysis is missing, particularly one that can be applied systematically to a desirably large number of speakers. The main objective of this study is to adapt and extend previous methodological advances for the analysis of MRI data to further investigate: how velar changes affect oral configurations; the changes to the articulators and constrictions when compared with oral counterparts; and the closest oral counterpart. High-frame-rate RT-MRI images (50 fps) are automatically processed to extract the vocal tract contours and the position/configuration of the different articulators. These data are processed with a quantitative articulatory analysis framework, previously proposed by the authors and here extended to include information regarding constrictions (degree and place) and the nasal port. For this study, while the analysis of data for more speakers is ongoing, we considered a set of two native EP speakers and addressed the study of oral and nasal vowels mainly in the context of stop consonants.
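A constriction is often summarised by the smallest distance between an articulator contour and the opposing vocal tract wall (its degree) and by where along the wall that minimum occurs (its place). The sketch below illustrates that general idea only; it is not the authors' framework. The function name, the representation of contours as lists of 2-D points, and the toy coordinates are all assumptions made for illustration.

```python
import math

Point = tuple[float, float]


def constriction(articulator: list[Point], wall: list[Point]) -> tuple[float, Point]:
    """Return (degree, place) for two non-empty contours.

    degree: smallest point-to-point distance between the two contours.
    place:  the wall point at which that smallest distance occurs.
    """
    best_dist = math.inf
    best_place = wall[0]
    for a in articulator:
        for w in wall:
            d = math.dist(a, w)  # Euclidean distance between the two points
            if d < best_dist:
                best_dist, best_place = d, w
    return best_dist, best_place


# Toy contours in arbitrary units (hypothetical, not measured data).
tongue = [(0.0, 0.0), (1.0, 1.5), (2.0, 1.0)]
palate = [(0.0, 2.0), (1.0, 2.0), (2.0, 2.0)]

degree, place = constriction(tongue, palate)
print(degree, place)
```

A brute-force search over all point pairs is used here for clarity; with densely sampled contours, a spatial index would be a more efficient choice.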