Humans can effortlessly categorize objects, both when they are conveyed through visual images and spoken words. To resolve the neural correlates of object categorization, studies have so far primarily focused on the visual modality. It is therefore still unclear how the brain extracts categorical information from auditory signals. In the current study we used EEG (N=48) and time-resolved multivariate pattern analysis to investigate (1) the time course with which object category information emerges in the auditory modality and (2) how the representational transition from individual object identification to category representation compares between the auditory modality and the visual modality. Our results show that (1) that auditory object category representations can be reliably extracted from EEG signals and (2) a similar representational transition occurs in the visual and auditory modalities, where an initial representation at the individual-object level is followed by a subsequent representation of the objects' category membership. Altogether, our results suggest an analogous hierarchy of information processing across sensory channels. However, there was no convergence towards conceptual modality-independent representations, thus providing no evidence for a shared supra-modal code.
Humans effortlessly make quick and accurate perceptual decisions about the nature of their immediate visual environment, such as the category of the scene they face. Previous research has revealed a rich set of cortical representations potentially underlying this feat. However, it remains unknown which of these representations are suitably formatted for decision-making. Here, we approached this question empirically and computationally, using neuroimaging and computational modelling. For the empirical part, we collected electroencephalography (EEG) data and reaction times from human participants during a scene categorization task (natural vs. man-made). We then related neural representations to behaviour using a multivariate extension of signal detection theory. We observed a correlation specifically between ~100 ms and ~200 ms after stimulus onset, suggesting that the neural scene representations in this time period are suitably formatted for decision-making. For the computational part, we evaluated a recurrent convolutional neural network (RCNN) as a model of brain and behaviour. Unifying our previous observations in an image-computable model, the RCNN predicted well the neural representations, the behavioural scene categorization data, as well as the relationship between them. Our results identify and computationally characterize the neural and behavioural correlates of scene categorization in humans.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.