To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the “causal inference problem.” Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet the underlying neural mechanisms remain unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
To obtain a coherent percept of the environment, the brain should integrate sensory signals from common sources and segregate those from independent sources. Recent research has demonstrated that humans integrate audiovisual information during spatial localization consistent with Bayesian Causal Inference (CI). However, the decision strategies that human observers employ for implicit and explicit CI remain unclear. Further, despite the key role of sensory reliability in multisensory integration, Bayesian CI has never been evaluated across a wide range of sensory reliabilities. This psychophysics study presented participants with spatially congruent and discrepant audiovisual signals at four levels of visual reliability. Participants localized the auditory signals (implicit CI) and judged whether auditory and visual signals came from common or independent sources (explicit CI). Our results demonstrate that humans employ model averaging as a decision strategy for implicit CI; they report an auditory spatial estimate that averages the spatial estimates under the two causal structures weighted by their posterior probabilities. Likewise, they explicitly infer a common source during the common-source judgment when the posterior probability for a common source exceeds a fixed threshold of 0.5. Critically, sensory reliability shapes multisensory integration in Bayesian CI via two distinct mechanisms: First, higher sensory reliability sensitizes humans to spatial disparity and thereby sharpens their multisensory integration window. Second, sensory reliability determines the relative signal weights in multisensory integration under the assumption of a common source. In conclusion, our results demonstrate that Bayesian CI is fundamental for integrating signals of variable reliabilities.
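The implicit and explicit decision strategies described above can be sketched in code. This is a minimal illustration, not the study's implementation: it assumes Gaussian likelihoods, a central Gaussian spatial prior, illustrative parameter values, and brute-force numerical integration over a grid of candidate locations.

```python
import numpy as np

def gauss(x, mu, var):
    """Gaussian density, evaluated elementwise."""
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def bci_model_averaging(x_a, x_v, sig_a, sig_v, sig_p=20.0, p_common=0.5):
    """Bayesian Causal Inference with model averaging (illustrative sketch).

    Returns the model-averaged auditory location estimate (implicit CI)
    and the posterior probability of a common source (explicit CI).
    """
    s = np.linspace(-90.0, 90.0, 3601)       # candidate source locations (deg)
    ds = s[1] - s[0]
    prior = gauss(s, 0.0, sig_p ** 2)        # central spatial prior (assumed)
    la = gauss(x_a, s, sig_a ** 2)           # auditory likelihood
    lv = gauss(x_v, s, sig_v ** 2)           # visual likelihood

    # Marginal likelihood of the signals under each causal structure.
    like_c1 = np.sum(la * lv * prior) * ds                       # one common source
    like_c2 = np.sum(la * prior) * ds * np.sum(lv * prior) * ds  # two sources

    post_c1 = (p_common * like_c1 /
               (p_common * like_c1 + (1 - p_common) * like_c2))

    # Posterior-mean location estimates under each causal structure.
    w1 = la * lv * prior
    s_fused = np.sum(s * w1) / np.sum(w1)    # forced fusion (common source)
    w2 = la * prior
    s_seg = np.sum(s * w2) / np.sum(w2)      # segregation (auditory alone)

    # Implicit CI: average the two estimates weighted by posterior probability.
    s_hat = post_c1 * s_fused + (1 - post_c1) * s_seg
    # Explicit CI: report "common source" when post_c1 exceeds 0.5.
    return s_hat, post_c1
```

For spatially congruent signals the posterior probability of a common source exceeds 0.5 and the auditory estimate is pulled toward the reliability-weighted fused location; for large disparities the posterior drops and the estimate reverts toward the auditory signal, which is the disparity-dependent integration window described above.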
Human observers typically integrate sensory signals in a statistically optimal fashion into a coherent percept by weighting them in proportion to their reliabilities. An emerging debate in neuroscience concerns the extent to which multisensory integration emerges already in primary sensory areas or is instead deferred to higher-order association areas. This fMRI study used multivariate pattern decoding to characterize the computational principles that define how auditory and visual signals are integrated into spatial representations across the cortical hierarchy. Our results reveal small multisensory influences that were limited to a spatial window of integration in primary sensory areas. By contrast, parietal cortices integrated signals weighted by their sensory reliabilities and task relevance in line with behavioral performance and principles of statistical optimality. Intriguingly, audiovisual integration in parietal cortices was attenuated for large spatial disparities when signals were unlikely to originate from a common source. Our results demonstrate that multisensory interactions in primary and association cortices are governed by distinct computational principles. In primary visual cortices, spatial disparity controlled the influence of non-visual signals on the formation of spatial representations, whereas in parietal cortices, it determined the influence of task-irrelevant signals. Critically, only parietal cortices integrated signals weighted by their bottom-up reliabilities and top-down task relevance into multisensory spatial priority maps to guide spatial orienting.
Transforming the barrage of sensory signals into a coherent multisensory percept relies on solving the binding problem – deciding whether signals come from a common cause and should be integrated or, instead, segregated. Human observers typically arbitrate between integration and segregation consistent with Bayesian Causal Inference, but the neural mechanisms remain poorly understood. Here, we presented participants with audiovisual sequences that varied in the number of flashes and beeps, then combined Bayesian modeling and EEG representational similarity analyses. Our data suggest that the brain initially represents the number of flashes and beeps independently. Later, it computes their numbers by averaging the forced-fusion and segregation estimates weighted by the probabilities of common and independent cause models (i.e. model averaging). Crucially, prestimulus oscillatory alpha power and phase correlate with observers’ prior beliefs about the world’s causal structure that guide their arbitration between sensory integration and segregation.
Behaviorally, it is well established that human observers integrate signals near-optimally, weighting them in proportion to their reliabilities, as predicted by maximum likelihood estimation (MLE). Yet, despite abundant behavioral evidence, it is unclear how the human brain accomplishes this feat. In a spatial ventriloquist paradigm, participants were presented with auditory, visual, and audiovisual signals and reported the location of the auditory or the visual signal. Combining psychophysics, multivariate functional MRI (fMRI) decoding, and MLE models, we characterized the computational operations underlying audiovisual integration at distinct cortical levels. We estimated observers’ behavioral weights by fitting psychometric functions to participants’ localization responses. Likewise, we estimated the neural weights by fitting neurometric functions to spatial locations decoded from regional fMRI activation patterns. Our results demonstrate that low-level auditory and visual areas encode predominantly the spatial location of the signal component of a region’s preferred auditory (or visual) modality. By contrast, intraparietal sulcus forms spatial representations by integrating auditory and visual signals weighted by their reliabilities. Critically, the neural and behavioral weights and the variance of the spatial representations depended not only on the sensory reliabilities as predicted by the MLE model but also on participants’ modality-specific attention and report (i.e., visual vs. auditory). These results suggest that audiovisual integration is not exclusively determined by bottom-up sensory reliabilities. Instead, modality-specific attention and report can flexibly modulate how intraparietal sulcus integrates sensory signals into spatial representations to guide behavioral responses (e.g., localization and orienting).
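The reliability-weighted fusion predicted by MLE can be written compactly. The following is a minimal sketch, with reliability defined as inverse variance; the function name and parameter values are illustrative, not from the study:

```python
def mle_fusion(x_a, x_v, sig_a, sig_v):
    """Reliability-weighted audiovisual fusion under forced integration (MLE).

    Reliability is inverse variance; the weights sum to one, and the fused
    variance is smaller than either unisensory variance.
    """
    r_a, r_v = 1.0 / sig_a ** 2, 1.0 / sig_v ** 2
    w_a = r_a / (r_a + r_v)                  # auditory weight
    s_hat = w_a * x_a + (1.0 - w_a) * x_v    # fused location estimate
    var_hat = 1.0 / (r_a + r_v)              # fused (reduced) variance
    return s_hat, var_hat
```

With, say, sig_a = 8 and sig_v = 2, the visual weight is 16/17, so the fused estimate lies almost on top of the visual signal. The psychometric and neurometric weights estimated in the study play the role of w_a here, which is how deviations from pure bottom-up MLE (e.g., attention effects) can be measured.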
Emotions can be aroused by various kinds of stimulus modalities. Recent neuroimaging studies indicate that several brain regions represent emotions at an abstract level, i.e., independently from the sensory cues from which they are perceived (e.g., face, body, or voice stimuli). If emotions are indeed represented at such an abstract level, then these abstract representations should also be activated by the memory of an emotional event. We tested this hypothesis by asking human participants to learn associations between emotional stimuli (videos of faces or bodies) and non-emotional stimuli (fractals). After successful learning, fMRI signals were recorded during the presentations of emotional stimuli and emotion-associated fractals. We tested whether emotions could be decoded from fMRI signals evoked by the fractal stimuli using a classifier trained on the responses to the emotional stimuli (and vice versa). This was implemented as a whole-brain searchlight, multivoxel activation pattern analysis, which revealed successful emotion decoding in four brain regions: posterior cingulate cortex (PCC), precuneus, medial prefrontal cortex (MPFC), and angular gyrus. The same analysis run only on responses to emotional stimuli revealed clusters in PCC, precuneus, and MPFC. Multidimensional scaling analysis of the activation patterns revealed clear clustering of responses by emotion across stimulus types. Our results suggest that PCC, precuneus, and MPFC contain representations of emotions that can be evoked by stimuli that carry emotional information themselves or by stimuli that evoke memories of emotional stimuli, while angular gyrus is more likely to take part in emotional memory retrieval.
The representation of reward anticipation and reward prediction errors is the basis for reward-associated learning. The representation of whether or not a reward occurred (reward receipt) is important for decision making. Recent studies suggest that, while reward anticipation and reward prediction errors are encoded in the midbrain and the ventral striatum, reward receipts are encoded in the medial orbitofrontal cortex. To substantiate this functional specialization, we analyzed data from an fMRI study in which 59 subjects completed two simple monetary reward paradigms. Because reward receipts and reward prediction errors were correlated, we applied a statistical model comparison to separate the effects of the two. Reward prediction error fitted BOLD responses significantly better than reward receipt in the midbrain and the ventral striatum. Conversely, reward receipt fitted BOLD responses better in the orbitofrontal cortex. Activation related to reward anticipation was found in the orbitofrontal cortex. These results confirm a functional specialization of behaviorally important aspects of reward processing within the mesolimbic dopaminergic system.
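The correlation between reward receipt and reward prediction error that motivates the model comparison falls out of any simple value-learning scheme: the prediction error is the receipt minus a slowly moving expectation. A toy Rescorla-Wagner simulation (assumed learning rate and reward probability, not the study's paradigms) makes the point:

```python
import numpy as np

rng = np.random.default_rng(0)

alpha = 0.2                        # learning rate (assumed)
v = 0.0                            # expected reward value
receipts, rpes = [], []
for _ in range(200):
    r = float(rng.random() < 0.5)  # reward receipt on ~half of trials
    delta = r - v                  # reward prediction error
    v += alpha * delta             # Rescorla-Wagner value update
    receipts.append(r)
    rpes.append(delta)

# Receipt and prediction-error regressors end up strongly correlated,
# which is why a formal model comparison is needed to separate them.
corr = np.corrcoef(receipts, rpes)[0, 1]
```

Because the expectation v changes slowly relative to the trial-by-trial receipt, the two regressors share most of their variance, so comparing their fits to BOLD responses, rather than entering both into one model, is the cleaner way to adjudicate between them.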