In a naturalistic environment, auditory cues are often accompanied by information from other senses, which can be redundant with or complementary to the auditory information. Although the multisensory interactions that arise from this combination of information and shape auditory function are seen across all sensory modalities, our greatest body of knowledge to date centers on how vision influences audition. In this review, we attempt to capture the current state of our understanding of this topic. Following a general introduction, the review is divided into five sections. In the first section, we review the psychophysical evidence in humans regarding vision’s influence on audition, distinguishing between vision’s ability to enhance versus alter auditory performance and perception. Three examples are then described that highlight vision’s ability to modulate auditory processes: spatial ventriloquism, cross-modal dynamic capture, and the McGurk effect. The final part of this section discusses models built from the available psychophysical data that seek to provide greater mechanistic insight into how vision can impact audition. The second section reviews the extant neuroimaging and far-field imaging work on this topic, with a strong emphasis on the roles of feedforward and feedback processes, on imaging insights into the causal nature of audiovisual interactions, and on the limitations of current imaging-based approaches. These limitations point to a greater need for machine-learning-based decoding approaches to understand how auditory representations are shaped by vision. The third section reviews the wealth of neuroanatomical and neurophysiological data from animal models that highlights audiovisual interactions at the neuronal and circuit level in both subcortical and cortical structures.
It also speaks to the functional significance of audiovisual interactions for two critically important facets of auditory perception: scene analysis and communication. The fourth section presents current evidence for alterations in audiovisual processes in three clinical conditions: autism, schizophrenia, and sensorineural hearing loss. These changes in audiovisual interactions are postulated to have cascading effects on higher-order domains of dysfunction in these conditions. The final section highlights ongoing work that seeks to leverage our knowledge of audiovisual interactions to develop better remediation approaches for these sensory-based disorders, grounded in concepts of perceptual plasticity in which vision has been shown to facilitate auditory learning.
Processes governing the creation, perception, and production of spoken words are sensitive to the patterns of speech sounds in the language user’s lexicon. Generative linguistic theory suggests that listeners infer constraints on possible sound patterning from the lexicon and apply these constraints to all aspects of word use. In contrast, emergentist accounts suggest that these phonotactic constraints are a product of interactive associative mapping with items in the lexicon. To determine the degree to which phonotactic constraints are lexically mediated, we observed the effects of learning new words that violate English phonotactic constraints (e.g., srigin) on phonotactic perceptual repair processes in nonword consonant-consonant-vowel (CCV) stimuli (e.g., /sre/). Subjects who learned such words were less likely to “repair” illegal onset clusters (/sr/) and report them as legal ones (/ʃr/). Effective connectivity analyses of MRI-constrained reconstructions of simultaneously collected magnetoencephalography (MEG) and EEG data showed that these behavioral shifts were accompanied by changes in the strength of influences of lexical areas on acoustic-phonetic areas. These results strengthen the interpretation of previous findings suggesting that phonotactic constraints on perception arise from top-down lexical influences on speech processing.
Temporal envelope fluctuations are abundant in nature and are critical for the perception of complex sounds. While psychophysical studies have characterized the perception of sinusoidal amplitude modulation (SAM), and neurophysiological studies report a subcortical transformation from a temporal to a rate-based code, no studies have characterized this transformation in unanesthetized animals or in nonhuman primates. To address this, we recorded single-unit responses and compared derived neurometric measures in the cochlear nucleus (CN) and inferior colliculus (IC) to psychometric measures of modulation frequency (MF) discrimination in macaques. IC and CN neurons often exhibited tuned responses to SAM in both their firing rate and spike timing. Neurometric thresholds spanned a large range (2-200 Hz ΔMF). The lowest 40% of IC thresholds were less than or equal to psychometric thresholds, regardless of which code was used, while CN thresholds were greater than psychometric thresholds. Discrimination at 10-20 Hz could be explained by indiscriminately pooling the responses of 30 units in either structure, while discrimination at higher MFs was best explained by more selective pooling. This suggests that pooled brainstem activity is sufficient for AM discrimination. Psychometric and neurometric thresholds decreased as a function of stimulus duration, but IC and CN thresholds were greater and more variable than behavioral thresholds at durations less than 500 ms. This slower subcortical temporal integration relative to behavior was consistent with a drift diffusion model that reproduced individual differences in performance and can constrain future neurophysiological studies of temporal integration. Together, these measures provide an account of AM perception at the neurophysiological, computational, and behavioral levels.
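The drift diffusion model invoked above is a standard evidence-accumulation account: noisy momentary evidence drifts toward one of two decision bounds, so longer stimulus durations permit more accumulation and yield higher accuracy. The sketch below illustrates only this generic duration effect; it is not the authors’ fitted model, and all parameter values (drift rate, bound height, noise level, step size) are arbitrary assumptions chosen for the demonstration.

```python
import numpy as np

def simulate_ddm(drift, boundary=1.0, noise=1.0, dt=0.002, max_t=1.0, rng=None):
    """Simulate one two-boundary drift diffusion trial.

    Returns (choice, t): choice is +1/-1 for the upper/lower bound,
    or 0 if neither bound is reached before max_t (an undecided trial,
    as can happen with short stimulus durations).
    """
    rng = rng if rng is not None else np.random.default_rng()
    x, t = 0.0, 0.0
    while t < max_t:
        # Euler step: deterministic drift plus Gaussian diffusion noise.
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if x >= boundary:
            return +1, t
        if x <= -boundary:
            return -1, t
    return 0, t  # evidence never reached a bound in time

def percent_correct(drift, duration, n_trials=500, seed=0):
    """Fraction of trials reaching the correct (upper) bound within `duration`."""
    rng = np.random.default_rng(seed)
    hits = sum(simulate_ddm(drift, max_t=duration, rng=rng)[0] == +1
               for _ in range(n_trials))
    return hits / n_trials

# With a fixed positive drift, accuracy grows with stimulus duration
# because more trials have time to reach the correct bound.
print(percent_correct(1.5, 0.25), percent_correct(1.5, 1.0))
```

In such a model, short durations leave many trials undecided (or force a guess), which is one way an accumulation account can reproduce the elevated, more variable thresholds observed at durations below 500 ms.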