Humans unconsciously track a wide array of distributional characteristics in their sensory environment. Recent research in spoken-language processing has demonstrated that the speech rate surrounding a target region within an utterance influences which words, and how many words, listeners hear later in that utterance. On the basis of hypotheses that listeners track timing information in speech over long timescales, we investigated the possibility that the perception of words is sensitive to speech rate over such a timescale (e.g., an extended conversation). Results demonstrated that listeners tracked variation in the overall pace of speech over an extended duration (analogous to that of a conversation that listeners might have outside the lab) and that this global speech rate influenced which words listeners reported hearing. The effects of speech rate became stronger over time. Our findings are consistent with the hypothesis that neural entrainment by speech occurs on multiple timescales, some lasting more than an hour.
A defining trait of linguistic competence is the ability to combine elements into increasingly complex structures to denote, and to comprehend, a potentially infinite number of meanings. Recent magnetoencephalography (MEG) work has investigated these processes by comparing the response to nouns in combinatorial (blue car) and non-combinatorial (rnsh car) contexts. In the current study we extended this paradigm using electroencephalography (EEG) to dissociate the role of semantic content from phonological well-formedness (yerl car). We used event-related potential (ERP) recordings in order to better relate the observed neurophysiological correlates of basic combinatorial operations to prior ERP work on comprehension. We found that nouns in combinatorial contexts (blue car) elicited a greater centro-parietal negativity between 180 and 400 ms, independent of the phonological well-formedness of the context word. We discuss the potential relationship between this ‘combinatorial’ effect and classic N400 effects. We also report preliminary evidence for an early anterior negative deflection immediately preceding the critical noun in combinatorial contexts, which we tentatively interpret as an electrophysiological reflex of syntactic structure initialization.
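To make the kind of analysis described above concrete, the sketch below epochs simulated EEG around event markers and measures mean amplitude in a 180–400 ms centro-parietal window. It is an illustrative sketch only, using MNE-Python on random data; the channel names, event timing, and time window are assumptions for demonstration, not the authors' actual pipeline.

```python
# Minimal sketch: ERP epoching and mean-amplitude measurement on simulated data.
import numpy as np
import mne

sfreq = 250.0
ch_names = ["Cz", "CPz", "Pz", "Fz"]          # assumed montage subset
info = mne.create_info(ch_names, sfreq, ch_types="eeg")

# Simulated continuous data: 4 channels x 60 s of noise (stand-in for a recording).
rng = np.random.default_rng(0)
raw = mne.io.RawArray(rng.normal(0, 1e-6, (len(ch_names), int(60 * sfreq))), info)

# Fake event markers: condition 1 = combinatorial (blue car), 2 = non-combinatorial.
onsets = np.arange(int(1 * sfreq), int(58 * sfreq), int(2 * sfreq))
codes = np.tile([1, 2], len(onsets) // 2 + 1)[: len(onsets)]
events = np.column_stack([onsets, np.zeros_like(onsets), codes])
event_id = {"combinatorial": 1, "non_combinatorial": 2}

epochs = mne.Epochs(raw, events, event_id, tmin=-0.2, tmax=0.8,
                    baseline=(None, 0), preload=True)

# Mean amplitude per condition in the 180-400 ms window over centro-parietal sites.
for cond in event_id:
    evoked = epochs[cond].average()
    window = evoked.copy().pick(["Cz", "CPz", "Pz"]).crop(0.18, 0.40)
    print(cond, window.data.mean())
```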
What structural properties do language and music share? Although early speculation identified a wide variety of possibilities, the literature has largely focused on the parallels between musical structure and syntactic structure. Here, we argue that parallels between musical structure and prosodic structure deserve more attention. We review the evidence for a link between musical and prosodic structure and find it to be strong. In fact, certain elements of prosodic structure may provide a parsimonious comparison with musical structure without sacrificing empirical findings related to the parallels between language and music. We then develop several predictions related to such a hypothesis.
During lexical access, listeners use both signal-based and knowledge-based cues, and information from the linguistic context can affect the perception of acoustic speech information. Recent findings suggest that the various cues used in lexical access are implemented with flexibility and may be affected by information from the larger speech context. We conducted 2 experiments to examine effects of a signal-based cue (distal speech rate) and a knowledge-based cue (linguistic structure) on lexical perception. In Experiment 1, we manipulated distal speech rate in utterances where an acoustically ambiguous critical word was either obligatory for the utterance to be syntactically well formed (e.g., Conner knew that bread and butter (are) both in the pantry) or optional (e.g., Don must see the harbor (or) boats). In Experiment 2, we examined identical target utterances as in Experiment 1 but changed the distribution of linguistic structures in the fillers. The results of the 2 experiments demonstrate that speech rate and linguistic knowledge about critical word obligatoriness can both influence speech perception. In addition, it is possible to alter the strength of a signal-based cue by changing information in the speech environment. These results provide support for models of word segmentation that include flexible weighting of signal-based and knowledge-based cues.
Listeners must adapt to differences in speech rate across talkers and situations. Speech rate adaptation effects are strong for adjacent syllables (i.e., proximal syllables). For studies that have assessed adaptation effects on speech rate information more than one syllable removed from a point of ambiguity in speech (i.e., distal syllables), the difference in strength between different types of ambiguity is stark. Studies of word segmentation have shown large shifts in perception as a result of distal rate manipulations, while studies of segmental perception have shown only weak, or even nonexistent, effects. However, no study has used standardized methods and materials to examine context effects for both types of ambiguity simultaneously. Here, a set of sentences was created that differed as minimally as possible except for whether the ambiguity concerned the voicing of a consonant or the location of a word boundary. The sentences were then rate-modified to slow the distal context to varying extents, according to three different definitions of distal context adapted from previous experiments, along with a manipulation of proximal context to assess whether proximal effects were comparable across ambiguity types. The results indicate that the definition of distal context strongly influenced the extent of distal rate effects for both segments and segmentation. They also establish the presence of distal rate effects on word-final segments for the first time. These results were replicated, with some caveats regarding the perception of individual segments, in an Internet-based sample recruited from Mechanical Turk.
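The sketch below shows one way a distal-rate manipulation of this general kind can be implemented: the context regions of a recording are time-stretched while the region around the ambiguity is left untouched. It is an illustrative sketch only; the file name, segment boundaries, and stretch factor are assumptions for demonstration and are not the materials or resynthesis method used in the study.

```python
# Minimal sketch: slow the distal context of a sentence, leave the proximal region intact.
import numpy as np
import librosa
import soundfile as sf

y, sr = librosa.load("sentence.wav", sr=None)       # hypothetical stimulus file

# Assume the proximal/ambiguous region spans 1.2-1.8 s; everything else is "distal".
prox_start, prox_end = int(1.2 * sr), int(1.8 * sr)
distal_pre = y[:prox_start]
proximal = y[prox_start:prox_end]
distal_post = y[prox_end:]

slowdown = 0.8  # rate < 1.0 lengthens the distal context (slower speech)
distal_pre_slow = librosa.effects.time_stretch(distal_pre, rate=slowdown)
distal_post_slow = librosa.effects.time_stretch(distal_post, rate=slowdown)

# Reassemble: slowed distal context around the unmodified proximal region.
modified = np.concatenate([distal_pre_slow, proximal, distal_post_slow])
sf.write("sentence_distal_slow.wav", modified, sr)
```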
Under the autosegmental-metrical (AM) theory of intonation, the temporal alignment of fundamental frequency (F0) patterns with respect to syllables has been claimed to distinguish pitch accent categories. Several experiments test whether differences in F0 peak or valley alignment in American English phrases would produce evidence consistent with a change from (1) an H* to an H+L* pitch accent, and (2) an L* to an L+H* pitch accent. Four stimulus series were constructed in which F0 peak or valley alignment was shifted across portions of short phrases with varying stress. In Experiment 1, participants discriminated pairs of stimuli in an AX task. In Experiment 2, participants classified stimuli as category exemplars using an AXB task. In Experiment 3, participants imitated stimuli; the alignment of F0 peaks and valleys in their productions was measured. Finally, in Experiment 4, participants judged the relative prominence of initial and final syllables in stimuli to determine whether alignment differences generated a stress shift. The results support the distinctions between H* and H+L* and between L+H* and L*. Moreover, evidence consistent with an additional category not currently predicted by most AM theories was obtained, which is proposed here to be H*+H. The results have implications for understanding phonological contrasts, phonetic interpolation in English intonation, and the transcription of prosodic contrasts in corpus-based analysis.
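As a rough illustration of what an alignment continuum of the kind described above looks like numerically, the sketch below builds a stylized stimulus series in which a single F0 peak is shifted in equal steps across a fixed span. All durations and F0 values are invented for demonstration; the actual stimuli were resynthesized speech, not synthetic contours.

```python
# Minimal sketch: a 7-step F0 peak-alignment continuum as stylized contours.
import numpy as np

phrase_dur = 1.0          # assumed phrase duration in seconds
fs = 100                  # F0 samples per second (10-ms frames)
t = np.linspace(0.0, phrase_dur, int(phrase_dur * fs))

baseline_f0 = 120.0       # Hz, assumed speaker baseline
peak_f0 = 220.0           # Hz, assumed accent peak

def contour_with_peak(peak_time, width=0.12):
    """Stylized rise-fall contour: a Gaussian bump on a flat baseline."""
    return baseline_f0 + (peak_f0 - baseline_f0) * np.exp(
        -0.5 * ((t - peak_time) / width) ** 2)

# Shift the peak from early (0.25 s) to late (0.55 s) relative to the
# stressed syllable, in equal steps.
peak_times = np.linspace(0.25, 0.55, 7)
series = np.stack([contour_with_peak(pt) for pt in peak_times])

for pt, step in zip(peak_times, series):
    print(f"peak at {pt:.3f} s, max F0 = {step.max():.1f} Hz")
```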
Purpose: Individuals vary in their ability to learn the sound categories of nonnative languages (nonnative phonetic learning) and to adapt to systematic differences, such as accent or talker differences, in the sounds of their native language (native phonetic learning). Difficulties with both native and nonnative learning are well attested in people with speech and language disorders relative to healthy controls, but substantial variability in these skills is also present in the typical population. This study examines whether this individual variability can be organized around a common ability that we label “phonetic plasticity.” Method: A group of healthy young adult participants (N = 80), who attested they had no history of speech, language, neurological, or hearing deficits, completed two tasks of nonnative phonetic category learning, two tasks of learning to cope with variation in their native language, and seven tasks of other cognitive functions, distributed across two sessions. Performance on these 11 tasks was compared, and exploratory factor analysis was used to assess the extent to which performance on each task was related to the others. Results: Performance on both tasks of native learning and an explicit task of nonnative learning patterned together, suggesting that native and nonnative phonetic learning tasks rely on a shared underlying capacity, which is termed “phonetic plasticity.” Phonetic plasticity was also associated with vocabulary, comprehension of words in background noise, and, more weakly, working memory. Conclusions: Nonnative sound learning and native language speech perception may rely on shared phonetic plasticity. The results suggest that good learners of native language phonetic variation are also good learners of nonnative phonetic contrasts. Supplemental Material: https://doi.org/10.23641/asha.16606778
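The sketch below shows the general shape of an exploratory factor analysis over a task battery like the one described above: scores are standardized, factored, and the loadings inspected for tasks that pattern together. It is an illustrative sketch only; the task names, number of factors, and simulated scores are assumptions for demonstration, not the study's data or analysis code.

```python
# Minimal sketch: exploratory factor analysis over simulated task scores.
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_participants = 80
tasks = ["nonnative_explicit", "nonnative_implicit", "native_accent",
         "native_talker", "vocabulary", "speech_in_noise", "working_memory"]

# Simulated stand-in scores: one latent "plasticity" factor plus task-specific noise.
latent = rng.normal(size=(n_participants, 1))
scores = pd.DataFrame(latent @ rng.uniform(0.3, 0.9, (1, len(tasks)))
                      + rng.normal(scale=0.7, size=(n_participants, len(tasks))),
                      columns=tasks)

z = StandardScaler().fit_transform(scores)
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
fa.fit(z)

loadings = pd.DataFrame(fa.components_.T, index=tasks,
                        columns=["factor_1", "factor_2"])
print(loadings.round(2))  # tasks loading on the same factor suggest a shared capacity
```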