During spoken language comprehension, listeners transform continuous acoustic cues into discrete categories (e.g., /b/ and /p/). While longstanding research suggests that phonetic categories are activated in a gradient way, there are also clear individual differences: more gradient categorization has been linked to communication impairments such as dyslexia and specific language impairment (Joanisse, Manis, Keating, & Seidenberg, 2000; López-Zamora, Luque, Álvarez, & Cobos, 2012; Serniclaes, Van Heghe, Mousty, Carré, & Sprenger-Charolles, 2004; Werker & Tees, 1987). Crucially, most studies have used two-alternative forced choice (2AFC) tasks to measure the sharpness of between-category boundaries. Here we propose an alternative paradigm that measures categorization gradiency more directly. Furthermore, we follow an individual differences approach to: (a) link this measure of gradiency to the integration of multiple acoustic cues, (b) explore its relationship to a set of other cognitive processes, and (c) evaluate its role in individuals' ability to perceive speech in noise. Our results validate this new method of assessing phoneme categorization gradiency and offer preliminary insights into how different aspects of speech perception may be linked to each other and to more general cognitive processes.
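In conventional 2AFC studies of this kind, boundary sharpness is typically estimated by fitting a logistic psychometric function to identification responses and taking its slope. A minimal self-contained sketch of that approach, with entirely hypothetical listener data and parameter ranges:

```python
import math

def neg_log_lik(k, x0, stimuli, responses):
    """Negative log-likelihood of binary identification responses under a
    logistic psychometric function p(x) = 1 / (1 + exp(-k * (x - x0)))."""
    nll = 0.0
    for x, r in zip(stimuli, responses):
        p = 1.0 / (1.0 + math.exp(-k * (x - x0)))
        p = min(max(p, 1e-9), 1 - 1e-9)  # guard against log(0)
        nll -= r * math.log(p) + (1 - r) * math.log(1 - p)
    return nll

def fit_logistic(stimuli, responses):
    """Grid-search the slope k (boundary sharpness) and boundary location x0
    that best fit the data. A steeper k means a more categorical listener."""
    best = None
    for k in [s / 10.0 for s in range(1, 51)]:       # slopes 0.1 .. 5.0
        for x0 in [b / 2.0 for b in range(0, 121)]:  # boundaries 0 .. 60 ms
            nll = neg_log_lik(k, x0, stimuli, responses)
            if best is None or nll < best[0]:
                best = (nll, k, x0)
    return best[1], best[2]

# Hypothetical 2AFC trials on a VOT continuum (ms): 1 = "/p/" response.
vots = [0, 10, 20, 30, 40, 50, 60]
sharp = [0, 0, 0, 1, 1, 1, 1]    # step-like (sharply categorical) listener
shallow = [0, 0, 1, 0, 1, 1, 1]  # noisier, more gradient listener

k_sharp, _ = fit_logistic(vots, sharp)
k_shallow, _ = fit_logistic(vots, shallow)
```

The fitted slope for the step-like listener comes out larger than for the noisy one, which is exactly the quantity 2AFC studies use as a sharpness index; the alternative paradigm proposed above aims to capture gradiency without routing it through this binary-response bottleneck.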
This tutorial analyzes voice onset time (VOT) data from Dongbei (Northeastern) Mandarin Chinese and North American English to demonstrate how Bayesian linear mixed models can be fit using the programming language Stan via the R package brms. Through this case study, we demonstrate some of the advantages of the Bayesian framework: researchers can (i) flexibly define the underlying process that they believe to have generated the data; (ii) obtain direct information regarding the uncertainty about the parameter that relates the data to the theoretical question being studied; and (iii) incorporate prior knowledge into the analysis. Getting started with Bayesian modeling can be challenging, especially when one is trying to model one's own (often unique) data. It is difficult to see how one can apply general principles described in textbooks to one's own specific research problem. We address this barrier to using Bayesian methods by providing three detailed examples, with source code to allow easy reproducibility. The examples presented are intended to give the reader a flavor of the process of model-fitting; suggestions for further study are also provided. All data and code are available from: https://osf.io/g4zpv.
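Advantage (iii) above, incorporating prior knowledge, can be illustrated without any special libraries. The stand-alone sketch below (in Python, whereas the tutorial itself uses R, Stan, and brms) uses a conjugate normal-normal update for a mean VOT; the prior values, token measurements, and the simplifying assumption of a known observation SD are all hypothetical, and a real brms analysis would instead estimate a full hierarchical model by MCMC:

```python
import math

def posterior_normal(prior_mean, prior_sd, data, obs_sd):
    """Conjugate update for a normal mean with known observation SD:
    the posterior precision is the sum of prior and data precisions, and
    the posterior mean is their precision-weighted average."""
    n = len(data)
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = n / obs_sd ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean
                 + data_prec * (sum(data) / n)) / post_prec
    post_sd = math.sqrt(1.0 / post_prec)
    return post_mean, post_sd

# Hypothetical example: prior belief that word-initial English /t/ VOT
# averages ~70 ms (SD 20 ms); six measured tokens, assumed obs SD 15 ms.
tokens = [62.0, 75.0, 68.0, 71.0, 80.0, 66.0]
mean, sd = posterior_normal(prior_mean=70.0, prior_sd=20.0,
                            data=tokens, obs_sd=15.0)
```

The posterior mean lands between the prior mean and the sample mean, and the posterior SD is narrower than the prior SD, directly expressing the reduced uncertainty after seeing data, which is the kind of interpretable uncertainty statement advantage (ii) refers to.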
This study examined individual differences in categorical perception and the use of multiple acoustic cues in the perception of the stop voicing contrast. The goals were to investigate whether gradiency of speech perception was related to listeners' differential sensitivity to acoustic cues and to individual differences in executive function. The experiment included two speech perception tasks (visual analogue scaling [VAS] and anticipatory eye movement [AEM]) administered to 30 English-speaking adults in two separate experimental sessions. Stimuli formed a /ta/ to /da/ continuum that systematically varied voice onset time (VOT) and fundamental frequency (f0). Some listeners showed a more gradient pattern of responses on the VAS task; those same listeners also showed greater sensitivity to f0 on the AEM task. These patterns were stable when individuals were tested on two separate occasions. The results suggest that variability in how categorically listeners perceive speech sounds is consistent and systematic within individuals.
Transcription-based studies have shown that tense stops appear before aspirated or lax stops in most Korean-acquiring children's speech. This order of mastery is predicted by the short lag Voice Onset Time (VOT) values of Korean tense stops, as this is the earliest acquired phonation type across languages. However, the tense stop also has greater motor demands than the other two phonation types, given its pressed voice quality (negative H1-H2) and its relatively high f0 value at vowel onset, word-initially. In order to explain the observed order of mastery of Korean stops, we need a more sensitive quantitative model of the role of multiple acoustic parameters in production and perception. This study explores the relationship between native speakers' transcriptions/categorizations of children's stop productions and three acoustic characteristics (VOT, H1-H2 and f0). The results showed that the primary acoustic parameter that adult listeners used to differentiate tense vs. non-tense stops was VOT. Listeners used VOT and the additional acoustic parameter of f0 to differentiate lax vs. aspirated stops. Thus, the early acquisition of tense stops is explained both by their short-lag VOT values and the fact that children need to learn to control only a single acoustic parameter to produce them.

Keywords: tense stop; Voice Onset Time; fundamental frequency (f0); transcription accuracy; Korean stop laryngeal contrast; phonological acquisition

J Phon. Author manuscript; available in PMC 2012 April 1. © 2011 Elsevier Ltd. All rights reserved. Corresponding author: ekong@wisc.edu; other authors: mbeckman@ling.osu.edu, jedwards2@wisc.edu.

Introduction

One of the most noteworthy achievements of modern phonetics is our understanding of how phonation categories map onto voice onset time (VOT) cross-linguistically. VOT is a continuous measure of the temporal relationship between two acoustic events that signal the onset of vocal fold vibration and the release of the oral constriction, but the three qualitatively different VOT relationships of "before" versus "simultaneous with" versus "well after" capture the three most commonly attested phonation categories across languages. For example, in Lisker & Abramson's (1964) seminal investigation of stop VOT distributions across eleven languages, the two stop phonation types in Dutch, Hungarian, Spanish and Tamil could be ...