To interact with objects in complex environments, we must know what they are and where they are in spite of challenging viewing conditions. Here, we investigated where, how and when representations of object location and category emerge in the human brain when objects appear on cluttered natural scene images, using a combination of functional magnetic resonance imaging, electroencephalography and computational models. We found that location representations emerge along the ventral visual stream towards the lateral occipital complex, mirrored by their gradual emergence in deep neural networks. Time-resolved analysis suggested that computing object location representations involves recurrent processing in high-level visual cortex. Object category representations also emerged gradually along the ventral visual stream, with evidence for recurrent computations. These results resolve the spatiotemporal dynamics of the ventral visual stream that give rise to representations of where and what objects are present in a scene under challenging viewing conditions.
Grasping the meaning of everyday visual events is a fundamental feat of human intelligence that hinges on diverse neural processes ranging from vision to higher-level cognition. Deciphering the neural basis of visual event understanding requires rich, extensive, and appropriately designed experimental data. However, such data have so far been missing. To fill this gap, we introduce the BOLD Moments Dataset (BMD), a large dataset of whole-brain fMRI responses to over 1,000 short (3 s) naturalistic video clips and accompanying metadata. We show that visual events interface with an array of processes, extending even to memory, and we reveal a match in hierarchical processing between brains and video-computable deep neural networks. Furthermore, we show that BMD successfully captures the temporal dynamics of visual events at second resolution. BMD thus establishes critical groundwork for investigations of the neural basis of visual event understanding.
Spatial attention helps us to efficiently localize objects in cluttered environments. However, the processing stage at which spatial attention modulates object location representations remains unclear. Here we addressed this question by identifying processing stages in time and in space using an EEG and an fMRI experiment, respectively. As both object location representations and attentional effects have been shown to depend on the background on which objects appear, we included object background as an experimental factor. During the experiments, human participants viewed images of objects appearing in different locations on blank or cluttered backgrounds while performing a task either at fixation or in the periphery, directing their covert spatial attention away from or towards the objects. We used multivariate classification to assess object location information. Consistently across the EEG and fMRI experiments, we show that spatial attention modulated location representations during late processing stages (>150 ms, in middle and high ventral visual stream areas) independent of background condition. Our results clarify the processing stage at which attention modulates object location representations in the ventral visual stream and show that attentional modulation is a cognitive process separate from recurrent processes related to the processing of objects on cluttered backgrounds.
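To make the idea of time-resolved multivariate classification of object location concrete, the following is a minimal sketch on synthetic data. It is not the authors' pipeline: the abstract does not name a classifier or cross-validation scheme, so a nearest-centroid classifier with leave-one-trial-out cross-validation is used here purely as a stand-in. The function names (`simulate_trials`, `decode_time_course`), the two-location design, and all parameter values are hypothetical.

```python
import random


def simulate_trials(n_trials_per_loc=20, n_channels=8, n_times=10, onset=5, seed=0):
    """Create synthetic 'EEG' trials for two object locations (assumed toy design).

    A location-specific channel pattern is added to Gaussian noise from
    time index `onset` onward; before that, trials contain noise only.
    Returns a list of (label, data) pairs with data[time][channel].
    """
    rng = random.Random(seed)
    patterns = {loc: [rng.gauss(0, 1) for _ in range(n_channels)] for loc in (0, 1)}
    trials = []
    for loc in (0, 1):
        for _ in range(n_trials_per_loc):
            data = []
            for t in range(n_times):
                gain = 2.0 if t >= onset else 0.0
                data.append([gain * patterns[loc][c] + rng.gauss(0, 1)
                             for c in range(n_channels)])
            trials.append((loc, data))
    return trials


def centroid(vectors):
    """Mean channel pattern across a list of equally long vectors."""
    n = len(vectors)
    return [sum(v[c] for v in vectors) / n for c in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two channel patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def decode_time_course(trials, n_times):
    """Leave-one-trial-out nearest-centroid accuracy at each time point."""
    labels = sorted({label for label, _ in trials})
    accuracies = []
    for t in range(n_times):
        correct = 0
        for i, (label, data) in enumerate(trials):
            # Train on all trials except the held-out one, at this time point only.
            train = [(l, d[t]) for j, (l, d) in enumerate(trials) if j != i]
            cents = {loc: centroid([v for l, v in train if l == loc]) for loc in labels}
            pred = min(cents, key=lambda loc: sq_dist(data[t], cents[loc]))
            correct += int(pred == label)
        accuracies.append(correct / len(trials))
    return accuracies


if __name__ == "__main__":
    trials = simulate_trials()
    for t, acc in enumerate(decode_time_course(trials, n_times=10)):
        print(f"time index {t}: location decoding accuracy = {acc:.2f}")
```

In this toy setting, accuracy should hover around chance before the simulated onset and rise afterwards. The time at which decoding first exceeds chance is the kind of quantity that lets an EEG analysis assign an effect to an early or late processing stage. Comparing such time courses between attention conditions would be one way to ask when attention begins to modulate location information.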