This study investigated articulatory variations in tongue positions and shapes during the production of Mandarin post-alveolar retroflex consonants by speakers with varying Mandarin proficiency (native Mandarin speakers, Japanese L2 speakers of Mandarin, and Japanese monolinguals with no knowledge of Mandarin). We aimed to examine (1) whether there are articulatory variations for Mandarin retroflex consonants and (2) whether the preferred variations differ across groups. Speakers either read aloud or imitated the consonants after hearing them, and their tongue positions and shapes were measured by electromagnetic articulography (WAVE, NDI). Results showed multiple articulatory variations for Mandarin retroflex consonants. Native Mandarin speakers produced either a concave or a convex tongue shape; all Japanese L2 speakers of Mandarin produced a convex tongue shape. In contrast, the majority of Japanese monolinguals imitated the sounds with an entirely different tongue position, bunching their tongues in the middle in a way that somewhat resembled the “bunched” rhotic of American English. Despite these articulatory variations, productions by most L2 speakers and several monolinguals were successfully identified as retroflex consonants by native Mandarin listeners. These results suggest that L2 speakers may prefer certain articulatory variations and that this preference may change with proficiency.
Previous studies on multimodal integration in speech perception have found that not only auditory and visual cues, but also tactile sensation—such as an air-puff on the skin that simulates aspiration—can be integrated in the perception of speech sounds (Gick & Derrick, 2009). However, most previous investigations have been conducted with English listeners, and it remains uncertain whether such multisensory integration can be shaped by linguistic experience. The current study investigates audio-aerotactile integration in phoneme perception for three groups: English monolingual, French monolingual, and English-French bilingual listeners. Six-step VOT continua of labial (/ba/—/pa/) and alveolar (/da/—/ta/) stops, constructed from both English and French endpoint models, were presented to listeners who performed a forced-choice identification task. Air-puffs synchronized to syllable onset and applied at random to the hand increased the number of ‘voiceless’ responses for the /da/—/ta/ continuum for both English and French listeners, which suggests that audio-aerotactile integration can occur even though some of the listeners did not have an aspiration/non-aspiration contrast in their native language. Furthermore, bilingual listeners showed larger air-puff effects for English stimuli than English monolinguals did, which suggests a complex relationship between linguistic experience and multisensory integration in perception.
Lip positions for a novel rounded vowel appeared to be produced as modifications of existing lip positions from the native repertoire. Moreover, the degree of vertical aperture appears to transfer readily, whereas the degree of protrusion is less likely to be retained in the new lip positions.