Human observers combine multiple sensory cues synergistically to achieve greater perceptual sensitivity, but little is known about the underlying neuronal mechanisms. We recorded from neurons in the dorsal medial superior temporal area (MSTd) during a task in which trained monkeys combine visual and vestibular cues near-optimally to discriminate heading. During bimodal stimulation, MSTd neurons combine visual and vestibular inputs linearly with sub-additive weights. Neurons with congruent heading preferences for visual and vestibular stimuli show improvements in sensitivity that parallel behavioral effects. In contrast, neurons with opposite preferences show diminished sensitivity under cue combination. Responses of congruent cells are more strongly correlated with the monkeys' perceptual decisions than those of opposite cells, suggesting that the animal monitors the activity of congruent cells to a greater extent during cue integration. These findings demonstrate perceptual cue integration in non-human primates and identify a population of neurons that may form its neural basis.

Understanding how the brain combines different sources of sensory information to optimize perception is a fundamental problem in neuroscience. Information from different sensory modalities is often seamlessly integrated into a unified percept. Combining sensory inputs leads to improved behavioral performance in many contexts, including integration of texture and motion cues for depth perception [1], stereo and texture cues for slant perception [2,3], visual-haptic integration [4,5], visual-auditory localization [6], and object recognition [7]. Multisensory integration in human behavior often follows predictions of a quantitative framework that applies Bayesian statistical inference to the problem of cue integration [8-10]. An important prediction is that subjects show greater perceptual sensitivity when two cues are presented together than when either cue is presented alone. This improvement in sensitivity is largest (a factor of √2) when the two cues have equal reliability [5,10].

Despite intense recent interest in cue integration, the underlying neural mechanisms remain unclear. Improved perceptual performance during cue integration is thought to be mediated by neurons selective for multiple sensory stimuli [11]. Multi-modal neurons have been described in several brain areas [12,13], but these studies have typically been performed in anesthetized or passively viewing animals [14-17]. Multi-modal neurons have not been studied during performance of tasks that require cue integration.
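The √2 prediction follows directly from maximum-likelihood cue combination: reliabilities (inverse variances) add, so the combined discrimination threshold is σ_comb = √(σ₁²σ₂²/(σ₁² + σ₂²)), which reduces to σ/√2 when the two single-cue thresholds are equal. A minimal numerical sketch (illustrative values, not data from the paper):

```python
import numpy as np

def combined_sigma(sigma_vis, sigma_vest):
    """Threshold predicted by maximum-likelihood (reliability-weighted)
    combination of two independent cues: reliabilities 1/sigma**2 add."""
    return np.sqrt((sigma_vis**2 * sigma_vest**2) /
                   (sigma_vis**2 + sigma_vest**2))

sigma = 2.0  # heading threshold (deg) for either cue alone -- illustrative
print(combined_sigma(sigma, sigma))          # 1.414..., i.e. sigma / sqrt(2)
print(sigma / combined_sigma(sigma, sigma))  # improvement factor = sqrt(2)
```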
Robust perception of self-motion requires integration of visual motion signals with nonvisual cues. Neurons in the dorsal subdivision of the medial superior temporal area (MSTd) may be involved in this sensory integration, because they respond selectively to global patterns of optic flow, as well as translational motion in darkness. Using a virtual-reality system, we have characterized the three-dimensional (3D) tuning of MSTd neurons to heading directions defined by optic flow alone, inertial motion alone, and congruent combinations of the two cues. Among 255 MSTd neurons, 98% exhibited significant 3D heading tuning in response to optic flow, whereas 64% were selective for heading defined by inertial motion. Heading preferences for visual and inertial motion could be aligned but were just as frequently opposite. Moreover, heading selectivity in response to congruent visual/vestibular stimulation was typically weaker than that obtained using optic flow alone, and heading preferences under congruent stimulation were dominated by the visual input. Thus, MSTd neurons generally did not integrate visual and nonvisual cues to achieve better heading selectivity. A simple two-layer neural network, which received eye-centered visual inputs and head-centered vestibular inputs, reproduced the major features of the MSTd data. The network was trained to compute heading in a head-centered reference frame under all stimulus conditions, such that it performed a selective reference-frame transformation of visual, but not vestibular, signals. The similarity between network hidden units and MSTd neurons suggests that MSTd may be an early stage of sensory convergence involved in transforming optic flow information into a (head-centered) reference frame that facilitates integration with vestibular signals.
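As a rough illustration of the modeling approach described above, the sketch below trains a small two-layer network to report heading in head-centered coordinates from eye-centered visual and head-centered vestibular population inputs, so the visual signal alone must undergo a reference-frame transformation. All details (1D headings, tuning widths, layer sizes, training procedure) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def pop_response(x, centers, sigma=20.0):
    """Gaussian tuning curves over a 1D heading variable (degrees)."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / sigma) ** 2)

# Training set: heading in head coordinates, plus a variable eye position.
n = 2000
heading_head = rng.uniform(-90, 90, n)   # target: head-centered heading (deg)
eye_pos      = rng.uniform(-20, 20, n)   # gaze offset (deg)

centers = np.linspace(-120, 120, 25)
vis  = pop_response(heading_head - eye_pos, centers)  # eye-centered visual input
vest = pop_response(heading_head, centers)            # head-centered vestibular input
X = np.hstack([vis, vest, eye_pos[:, None] / 20.0])   # eye-position signal available

# Two-layer network: tanh hidden layer, linear output, trained by gradient
# descent to report heading in head-centered coordinates for all inputs.
W1 = rng.normal(0, 0.1, (X.shape[1], 40)); b1 = np.zeros(40)
W2 = rng.normal(0, 0.1, (40, 1));          b2 = np.zeros(1)
y = heading_head[:, None] / 90.0          # normalized target
lr = 0.05
for _ in range(3000):
    H = np.tanh(X @ W1 + b1)              # hidden layer
    out = H @ W2 + b2
    err = out - y
    dW2 = H.T @ err / n; db2 = err.mean(0)
    dH = err @ W2.T * (1 - H**2)          # backprop through tanh
    dW1 = X.T @ dH / n; db1 = dH.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2; W1 -= lr * dW1; b1 -= lr * db1

print("RMS heading error (deg):", 90 * np.sqrt(np.mean((out - y) ** 2)))
```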
Elegant sensory structures in the inner ear have evolved to measure head motion. These vestibular receptors consist of highly conserved semicircular canals and otolith organs. Unlike other senses, vestibular information in the central nervous system becomes immediately multisensory and multimodal. There is no overt, readily recognizable conscious sensation from these organs, yet vestibular signals contribute to a surprising range of brain functions, from the most automatic reflexes to spatial perception and motor coordination. Critical to these diverse, multimodal functions are multiple computationally intriguing levels of processing. For example, the need for multisensory integration necessitates vestibular representations in multiple reference frames. Proprioceptive-vestibular interactions, coupled with corollary discharge of a motor plan, allow the brain to distinguish actively generated from passive head movements. Finally, nonlinear interactions between otolith and canal signals allow the vestibular system to function as an inertial sensor and contribute critically to both navigation and spatial orientation.
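The canal-otolith computation mentioned in the final sentence can be illustrated with the standard internal-model account of tilt/translation disambiguation: the otoliths sense only the gravito-inertial acceleration f = g − a, but rotating an internal gravity estimate using canal signals (dg/dt = −ω × g) lets the brain recover translation as a = g_est − f. A toy sketch of that account (all parameters illustrative; not the authors' model):

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 2.0, dt)

# First second: the head pitches at 0.5 rad/s (sensed by the canals).
# Second second: the head accelerates forward at 0.3 m/s^2 (no rotation).
omega  = np.where(t < 1.0, 0.5, 0.0)   # canal signal: pitch velocity (rad/s)
a_true = np.where(t < 1.0, 0.0, 0.3)   # true forward acceleration (m/s^2)

g     = np.array([0.0, 0.0, -9.81])    # true gravity in head coordinates
g_est = g.copy()                       # brain's internal estimate of gravity

for w, a in zip(omega, a_true):
    # Head rotation moves gravity in head coordinates: dg/dt = -omega x g.
    g     = g     + dt * -np.cross([0.0, w, 0.0], g)
    g_est = g_est + dt * -np.cross([0.0, w, 0.0], g_est)
    f = g - np.array([a, 0.0, 0.0])    # otolith signal: gravito-inertial force
    a_est = g_est - f                  # inferred translation: a = g_est - f

print("inferred forward acceleration (m/s^2):", round(a_est[0], 3))  # ~0.3
```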
The perception of self-motion direction, or heading, relies on integration of multiple sensory cues, especially from the visual and vestibular systems. However, the reliability of sensory information can vary rapidly and unpredictably, and it remains unclear how the brain integrates multiple sensory signals given this dynamic uncertainty. Human psychophysical studies have shown that observers combine cues by weighting them in proportion to their reliability, consistent with statistically optimal integration schemes derived from Bayesian probability theory. Remarkably, because cue reliability is varied randomly across trials, the perceptual weight assigned to each cue must change from trial to trial. Dynamic cue reweighting has not been examined for combinations of visual and vestibular cues, nor has the Bayesian cue integration approach been applied to laboratory animals, an important step toward understanding the neural basis of cue integration. To address these issues, we tested human and monkey subjects in a heading discrimination task involving visual (optic flow) and vestibular (translational motion) cues. The cues were placed in conflict on a subset of trials, and their relative reliability was varied to assess the weights that subjects gave to each cue in their heading judgments. We found that monkeys can rapidly reweight visual and vestibular cues according to their reliability, the first such demonstration in a nonhuman species. However, some monkeys and humans tended to over-weight vestibular cues, inconsistent with simple predictions of a Bayesian model. Nonetheless, our findings establish a robust model system for studying the neural mechanisms of dynamic cue reweighting in multisensory perception.
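The reliability-weighting scheme tested here has a simple closed form: each cue's weight is its inverse variance normalized by the summed inverse variances, so the predicted percept on a cue-conflict trial is the reliability-weighted average of the two single-cue headings. A small sketch with illustrative thresholds (not the study's data):

```python
import numpy as np

def optimal_weights(sigma_vis, sigma_vest):
    """Reliability (inverse-variance) weights for Bayesian cue combination."""
    r_vis, r_vest = 1 / sigma_vis**2, 1 / sigma_vest**2
    w_vis = r_vis / (r_vis + r_vest)
    return w_vis, 1 - w_vis

def predicted_heading(h_vis, h_vest, sigma_vis, sigma_vest):
    """Perceived heading = reliability-weighted average of the two cues."""
    w_vis, w_vest = optimal_weights(sigma_vis, sigma_vest)
    return w_vis * h_vis + w_vest * h_vest

# Cue-conflict trial: the cues signal headings 4 deg apart. Lowering visual
# coherence raises sigma_vis, so the visual weight should drop and the
# percept should shift toward the vestibular heading.
for sigma_vis in (1.0, 2.0, 4.0):  # illustrative single-cue thresholds (deg)
    print(sigma_vis, predicted_heading(+2.0, -2.0, sigma_vis, sigma_vest=2.0))
```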
Integration of multiple sensory cues is essential for precise and accurate perception and behavioral performance, yet the reliability of sensory signals can vary across modalities and viewing conditions. Human observers typically employ the optimal strategy of weighting each cue in proportion to its reliability, but the neural basis of this computation remains poorly understood. We trained monkeys to perform a heading discrimination task from visual and vestibular cues, varying cue reliability at random. Monkeys appropriately placed greater weight on the more reliable cue, and population decoding of neural responses in area MSTd closely predicted behavioral cue weighting, including modest deviations from optimality. We further show that the mathematical combination of visual and vestibular inputs by single neurons is generally consistent with recent theories of optimal probabilistic computation in neural circuits. These results provide direct evidence for a neural mechanism mediating a simple and widespread form of statistical inference.
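A hedged sketch of the kind of probabilistic computation the last sentences allude to: with Poisson-like variability and proportionally scaled tuning curves, simply adding two population responses yields a likelihood equal to the product of the single-cue likelihoods, so a linear combination of inputs can implement optimal integration. The tuning parameters below are illustrative assumptions, not recorded data:

```python
import numpy as np

rng = np.random.default_rng(1)
headings = np.linspace(-30, 30, 601)  # candidate headings for decoding (deg)
centers = np.linspace(-60, 60, 31)    # preferred headings of model neurons

def tuning(h, gain):
    """Gaussian tuning curves scaled by gain; gain plays the role of cue
    reliability, and the pedestal scales too so mean rates add exactly."""
    return gain * (np.exp(-0.5 * ((h - centers) / 15.0) ** 2) + 0.05)

def log_likelihood(spikes, gain):
    """Poisson log likelihood of a population response over headings."""
    f = np.array([tuning(h, gain) for h in headings])  # (601, 31) mean rates
    return spikes @ np.log(f).T - f.sum(axis=1)

true_heading, g_vis, g_vest = 5.0, 8.0, 4.0
r_vis  = rng.poisson(tuning(true_heading, g_vis))
r_vest = rng.poisson(tuning(true_heading, g_vest))

# Adding the two population responses multiplies the single-cue likelihoods,
# so the combined estimate is sharper than either single-cue estimate.
for label, ll in [("visual",     log_likelihood(r_vis, g_vis)),
                  ("vestibular", log_likelihood(r_vest, g_vest)),
                  ("combined",   log_likelihood(r_vis + r_vest, g_vis + g_vest))]:
    post = np.exp(ll - ll.max()); post /= post.sum()
    mu = post @ headings
    sd = np.sqrt(post @ headings**2 - mu**2)
    print(f"{label:10s} estimate = {mu:5.2f} deg, sd = {sd:4.2f} deg")
```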