This paper investigates whether compensation for coarticulation in speech perception is mediated by the listener's native language. Substantial work has treated compensation as a consequence of general auditory processing or of a perceptual gestural-recovery process. The role of linguistic experience in compensation for coarticulation potentially cross-cuts this controversy and may shed light on the phonetic basis of compensation. In Experiment 1, native French and English listeners identified the initial sound of fricative-vowel syllables on a continuum from [s] to [ʃ] before the vowels [a], [u], and [y]. French speakers are familiar with the rounded vowel [y], while it is unfamiliar to English speakers. Both groups showed compensation (a shifted "s"/"sh" boundary relative to [a]) for the vowel [u], but only the French-speaking listeners reliably compensated for the vowel [y]. In Experiment 2, twenty-four American English listeners judged videos in which the audio stimuli of Experiment 1 served as soundtracks for a face saying [s]V, [ʃ]V, or a visual blend of the two fricatives. Videos with [ʃ] visual information induced significantly more "sh" responses than those made from visual [s] tokens. However, as in Experiment 1, English-speaking listeners reliably compensated for [u] but not for the unfamiliar vowel [y]. Listeners used visual consonant information for categorization, but did not use visual vowel information for compensation for coarticulation. The results indicate that perceptual compensation for coarticulation is a language-specific effect tied to the listener's experience with the conditioning phonetic environment.
Pronunciation variation is in many ways systematic, yielding patterns that a canny listener could exploit to aid perception. This work asks whether listeners actually do draw on these patterns during speech perception. We focus in particular on a phenomenon known as paradigmatic enhancement, in which suffixes are phonetically enhanced in verbs that are frequent in their inflectional paradigms. Across four experiments, we found that listeners do not seem to attend to paradigmatic enhancement patterns. They do, however, attend to the distributional properties of a verb's inflectional paradigm when the experimental task encourages attention to sublexical detail, as in phoneme monitoring (Experiments 1a–b). When the task requires more holistic lexical processing, as in lexical decision (Experiment 2), the effect of paradigmatic probability disappears. When stimuli are presented in full sentences, so that the surrounding context provides richer contextual and semantic information (Experiment 3), even otherwise robust influences such as lexical frequency disappear. We propose that these findings are consistent with a perceptual system that is flexible and devotes processing resources only to those patterns that provide a sufficient cognitive return on investment.
Relationship between perceptual accuracy and information measures: A cross-linguistic study
Shinae Kang
Doctor of Philosophy in Linguistics, University of California, Berkeley
Professor Keith A. Johnson, Chair

This dissertation studies how the information conveyed by different speech elements of English, Japanese, and Korean correlates with perceptual accuracy. Two well-established information measures are used: the weighted negative contextual predictability (informativity) of a speech element, and the contribution of a speech element to syllable differentiation, or functional load. The dissertation finds that the correlation between information and perceptual accuracy differs depending on both the type of information measure and the language of the listener. To compute the information measures, Chapter 2 introduces a new corpus consisting of all the possible syllables of each of the three languages; the chapter shows that the two information measures are inversely correlated. Chapters 3 and 4 describe two perception experiments, in audio-only and audiovisual modalities respectively, both adopting a forced-choice identification paradigm. In both experiments, subjects listened to VC.CV stimuli composed of spliced VC and CV chunks and were asked to identify the CC sequence. Multiple statistical models are constructed to predict the perceptual accuracy of a CC stop cluster from the associated information measures of relevant speech elements in the listeners' languages. The estimated models show that high informativity has a generally negative effect on the perceptual accuracy of stops. Functional load shows less consistent correlations with perceptual accuracy across model specifications, but generally has a positive effect. In addition, Japanese listeners show highly consistent results across the two experiments and different model specifications, in contrast with less consistent results for English and Korean listeners. The dissertation provides the first empirical evidence for a significant relationship between informativity and functional load, and between the information measures of speech elements and their perceptual accuracy. Furthermore, it reveals how listeners' native languages affect that relationship, and how cross-linguistic variation in that relationship may be related to characteristics of individual languages such as their syllable structures.
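The first of the two measures, informativity, can be made concrete with a small sketch. The following is a minimal, hypothetical illustration of informativity as weighted negative contextual predictability, computed over a toy syllable list with the preceding segment as the context; the dissertation's actual corpora, context definitions, and weighting are more elaborate.

```python
from collections import Counter
from math import log2

# Toy "syllable corpus": each syllable is a string of one-character segments.
syllables = ["ka", "ki", "ta", "ti", "sa", "ka", "ta", "sa", "si"]

def informativity(corpus, target):
    """Average surprisal of `target` over its contexts, weighted by how
    often each context co-occurs with it (weighted negative contextual
    predictability). The context here is just the preceding segment;
    syllable-initial segments get the context "#"."""
    pair_counts = Counter()      # (context, segment) counts
    context_counts = Counter()   # context counts
    for syl in corpus:
        prev = "#"
        for seg in syl:
            pair_counts[(prev, seg)] += 1
            context_counts[prev] += 1
            prev = seg
    total_target = sum(n for (c, s), n in pair_counts.items() if s == target)
    info = 0.0
    for (c, s), n in pair_counts.items():
        if s != target:
            continue
        p_context_given_seg = n / total_target          # weighting term
        p_seg_given_context = n / context_counts[c]     # predictability
        info += p_context_given_seg * -log2(p_seg_given_context)
    return info
```

In this toy corpus [a] is more predictable from its context than [i] (it follows each onset twice rather than once), so `informativity(syllables, "i")` comes out higher than `informativity(syllables, "a")`, matching the intuition that a harder-to-predict segment carries more information.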
Listeners can shift their attention to different-sized units of speech during speech perception. This study extends that claim and investigates whether linguistic structure affects this attention. Since English has a larger syllable inventory than Korean and Japanese, each English phoneme plays a larger functional role. Listeners also have different levels of phonological awareness owing to differences in orthography. We focus on the effect of perceptual attention on the perceptibility of intervocalic consonant clusters (VCCV) and on whether it varies cross-linguistically with these structural factors. We first recorded eight talkers saying VC and CV syllables and spliced the syllables to create non-overlapping VC.CV stimuli. Listeners in three language groups (English/Korean/Japanese) completed a 9-alternative forced-choice perception task. They identified the CC as one of nine alternatives ("pt", "pk", "pp", etc.), and in an attention-manipulated condition did the same task while also monitoring for target talkers. Preliminary results show that Korean listeners were less perceptually sensitive to clusters than English listeners, and that English listeners perceived the syllable coda better when prompted to focus on the coda only. The results indicate that the linguistic structure of a language can affect the level of perceptual attention its users give to a linguistic unit.
Functional load (FL) is an information-theoretic measure that captures a phoneme's contribution to successful word identification. Experimental findings have shown that it can help explain patterns in perceptual accuracy. Here, we ask whether the relationship between FL and perception has larger consequences for the structure of a language's lexicon. Since reducing FL minimizes the risk of misidentifying a word in the case where a listener inaccurately perceives the initial phoneme, we predicted that in spoken language, where perceptual accuracy is important for successful communication, the lexicon will be structured to reduce FL in auditorily confusable initial phonemes more than in written language. To test this prediction, we compared FL of all initial phonemes in spoken and academic written genres of the COCA corpus. We found that FL in phoneme pairs in the spoken corpus is overall higher and more variable than in the academic corpus, a natural consequence of the smaller lexical inventory characteristic of spoken language. In auditorily confusable pairs, however, this difference is relatively reduced, such that spoken FL decreases relative to academic FL. We argue that this reflects a pressure in spoken language to use words for which inaccurate perception does minimal damage to word identification.
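One common way to formalize functional load is the relative drop in lexical entropy when two phonemes are merged (an entropy-based measure in the style of Surendran and Niyogi). The sketch below applies that formalization to a hypothetical mini-lexicon with invented token frequencies; it is an illustration of the concept, not the measure as computed from the COCA corpus in this study.

```python
from collections import Counter
from math import log2

def entropy(freqs):
    """Shannon entropy (bits) of a frequency distribution."""
    total = sum(freqs.values())
    return -sum(n / total * log2(n / total) for n in freqs.values())

def functional_load(word_freqs, p1, p2):
    """Relative loss in lexical entropy when p2 is collapsed into p1,
    i.e. how much the p1/p2 contrast contributes to keeping words distinct."""
    base = entropy(word_freqs)
    merged = Counter()
    for word, n in word_freqs.items():
        merged[word.replace(p2, p1)] += n   # neutralize the contrast
    return (base - entropy(merged)) / base

# Hypothetical mini-lexicon; orthography stands in for phonemic transcription.
lexicon = {"sip": 10, "ship": 10, "sin": 5, "shin": 5, "top": 8}
high_fl = functional_load(lexicon, "s", "sh")  # merging s/sh collapses two minimal pairs
zero_fl = functional_load(lexicon, "p", "b")   # "b" never occurs, so nothing merges
```

Merging the high-load contrast collapses "sip"/"ship" and "sin"/"shin" into single entries, shrinking the lexicon's entropy, whereas merging a pair with no minimal pairs leaves the lexicon, and hence the entropy, unchanged.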
Foreword

The ENGG 3100 proceedings are a collection of papers written by undergraduate students enrolled in the ENGG 3100 Design III course offered by the School of Engineering, University of Guelph, during the Winter 2007 term. Design III is the third course in a four-course design sequence that all students studying Engineering at Guelph must take, regardless of their Engineering specialty. The course prepares students for open-ended design projects by guiding them through the design process using an active-learning approach. Each student works in a group of four on one of several pre-selected design projects; these projects typically cover all of the Engineering fields offered by the School of Engineering. Students are engaged in the design process through lectures and weekly meetings with highly experienced teaching assistants and course instructors. Students also receive feedback on their designs after submitting two design reports for evaluation and making two presentations to the entire class. For the Winter 2007 offering, we decided for the first time to require students to write, at the end of the course, a short paper describing their design, to be collected and published in annual proceedings. The papers were reviewed, and feedback was given to students to help them prepare a second version; those versions are what is published in these proceedings. The proceedings are also part of the active-learning approach to teaching design skills. Writing a short paper that describes an engineering design is an important skill with many benefits. Innovative designs are typically presented at technical conferences and/or industry trade shows, where a short, well-written description of the design is often required. In other cases, patents are filed to secure the intellectual property rights of inventors; the inventor must then provide a short document that examines the current state of the art and introduces the invention. Finally, students who are interested in graduate studies will be able to explore one of the most common forms of academic publishing. The format and style of these papers follow the guidelines for articles published by the Institute of Electrical and Electronics Engineers (IEEE).
This study investigates how visual phonetic information affects compensation for coarticulation in speech perception. A series of CV syllables with a fricative continuum from [s] to [ʃ] before [a], [u], and [y] was overlaid with video of a face saying [s]V, [ʃ]V, or a visual blend of the two fricatives. We made separate movies for each vowel environment and collected [s]/[ʃ] boundary locations from 24 native English speakers. In a test of audiovisual integration, [ʃ] videos yielded significantly lower boundary locations (more [ʃ] responses) than [s] videos (t[23] = 2.9, p < 0.01) in the [a] vowel environment. Regardless of visual fricative condition, participants showed a compensation effect with [u] (t[23] > 3, p < 0.01), but not with the unfamiliar vowel [y]. This pattern of results is similar to our findings from an audio-only version of the experiment, implying that the compensation effect was not strengthened by seeing the lip rounding of [y].