Zhe-chen Guo scite author profile

2019

Lang Speech

Experience with native-language prosody encourages language-specific strategies for speech segmentation. Conflicting findings from previous research suggest that these strategies may not be abstracted away from the acoustic manifestation of prosodic features in the native speech. Using the artificial language learning paradigm, the current study explores this possibility in connection with listeners of a lexical tone language called Taiwanese Southern Min (TSM). In TSM, the only rising lexical tone occurs almost only on the final syllable of the language’s tone sandhi domain and is phonetically associated with final lengthening. Based on these observations, Experiment I examined what constituted a sufficient finality cue for use by TSM listeners to support segmentation: (a) final fundamental frequency (F0) rise only; or (b) final F0 rise conjoined with final lengthening. The results showed that segmentation was inhibited by the former cue but facilitated by the latter. Experiment II showed that the facilitation cannot be attributed entirely to final lengthening, as a null effect was found when final lengthening was the sole prosodic cue to segmentation. It is thus assumed that acoustic details as fine-grained as the lengthening of the rising tone are involved in the modulation of the segmentation strategy whereby TSM listeners perceive F0 rise as signaling finality. The inhibitory effect of final F0 rise alone found in Experiment I motivated Experiment III, which revealed that initial F0 rise in the absence of lengthening cues improved TSM listeners’ segmentation. It is speculated that such use of initial F0 rise might reflect a cross-linguistic segmentation solution.

Perception of gesturally different Mandarin retroflexes by Taiwan Mandarin speakers

Guo

2021

Preprint

This paper reports results from a study investigating whether there is a perceptual difference between gesturally different Mandarin retroflexes. Previous studies have suggested that there are two articulatory manners for Mandarin retroflexes: One involves the tongue tip being “curled-up,” and the other the tongue body being “bunched-up.” Thus, by implementing a perception test on Taiwan Mandarin listeners and an acoustic analysis, the research determines whether retroflexes produced with these gestures will be perceived differently. The results show that “curled-up” and “bunched-up” retroflexes are not perceptually contrastive at a phonological level. However, the latter are perceived to be phonetically more retroflexed, with this property of stronger retroflexion reflected in their lower M1 (first moment) values. These findings yield one pedagogical implication: the teaching of retroflex articulation can refer to whichever gesture Mandarin learners can produce with more ease.

Speaking clearly improves speech segmentation by statistical learning under optimal listening conditions

Smiljanić

2021

This study investigated the effect of speaking style on speech segmentation by statistical learning under optimal and adverse listening conditions. Similar to the intelligibility and memory benefits found in previous studies, enhanced acoustic-phonetic cues of the listener-oriented clear speech could improve speech segmentation by statistical learning compared to conversational speech. Yet, it could not be precluded that hyper-articulated clear speech, reported to have less pervasive coarticulation, would result in worse segmentation than conversational speech. We tested these predictions using an artificial language learning paradigm. Listeners who acquired English before age six were played continuous repetitions of the 'words' of an artificial language, spoken either clearly or conversationally and presented either in quiet or in noise at a signal-to-noise ratio of +3 or 0 dB SPL. Next, they recognized the artificial words in a two-alternative forced-choice test. Results supported the prediction that clear speech facilitates segmentation by statistical learning more than conversational speech but only in the quiet listening condition. This suggests that listeners can use clear speech acoustic-phonetic enhancements to guide speech processing dependent on domain-general, signal-independent statistical computations. However, there was no clear speech benefit in noise at either signal-to-noise ratio. We discuss possible mechanisms that could explain these results.

The use of tonal coarticulation in segmentation of artificial language speech: A study with Mandarin listeners

Applied Psycholinguistics

2021

Tonal carryover assimilation, whereby a tone is assimilated to the preceding one, is conditioned by prosodic boundaries in a way suggesting that its presence may signal continuity or lack of a boundary. Its possibility as a speech segmentation cue was investigated in two artificial language (AL) learning experiments. Mandarin-speaking listeners identified the “words” of a three-tone AL (e.g., [pé.tī.kù]) after listening to six long speech streams in which the words were repeated continuously without pauses. The first experiment revealed that segmentation was disrupted in an “incongruent-cues” condition where tonal carryover assimilation occurred across AL word boundaries and conflicted with statistical regularities in the speech streams. Segmentation was neither facilitated nor inhibited in a “congruent-cues” condition where tonal carryover assimilation occurred only within the AL words in 27% of the repetitions and never across word boundaries. A null effect was again found for the congruent-cues condition of the second experiment, where all AL word repetitions carried tonal carryover assimilation. These findings show that tonal carryover assimilation is exploited to resolve segmentation problems when cues conflict. Its null effect in the congruent-cues conditions might be linked to cue redundancy and suggest that it is weighted low in the segmentation cue hierarchy.

Tonal carryover assimilation is exploited as a speech segmentation cue when cues conflict

2020

Tonal carryover assimilation, whereby a tone is phonetically assimilated to the preceding one, is widely observed across tone languages. This tonal coarticulatory effect is stronger across a smaller prosodic boundary (Lai and Kuang, 2016), suggesting that it may be a speech segmentation cue. The possibility was investigated in two artificial language (AL) learning experiments. Mandarin-speaking participants listened to long utterances in which tokens of the “words” of a three-tone AL (e.g., [pé.tī.kù]) were concatenated without pauses and then identified the words in a test. The first experiment revealed that segmentation was disrupted in an “incongruent-cues” condition where tonal carryover assimilation occurred across AL words in conflict with statistical regularities. Segmentation was neither improved nor inhibited in a “congruent-cues” condition where tonal carryover assimilation occurred within AL words but in only 27% of the word tokens. A follow-up experiment that included a similar congruent-cues condition but maximized the number of cue-bearing word tokens still found a null effect. It is concluded that tonal carryover assimilation is exploited to segment speech in the case of cue incongruence. Yet, it seems redundant when it agrees with statistical regularities, possibly because it is weighted low in the segmentation cue hierarchy (Mattys et al., 2005).