Consciousness is now a well-established field of empirical research. A large body of experimental results has been accumulated and is steadily growing. In parallel, many Theories of Consciousness (ToCs) have been proposed. These theories are diverse in nature, ranging from computational to neurophysiological and quantum theoretical approaches. This contrasts with other fields of natural science, which host a smaller number of competing theories. We suggest that one reason for this abundance of extremely different theories may be the lack of stringent criteria specifying how empirical data constrains ToCs. First, we argue that consciousness is a well-defined topic from an empirical point of view and motivate a purely empirical stance on the quest for consciousness. Second, we present a checklist of criteria that, we propose, empirical ToCs need to cope with. Third, we review 13 of the most influential ToCs and subject them to the criteria. Our analysis helps to situate these different ToCs in the theoretical landscape and sheds light on their strengths and weaknesses from a strictly empirical point of view.
Conscious perception seems to be a continuous stream of percepts. Is this true? Recent research sheds new light on this age-old debate. In long-lasting postdictive effects, later events can determine the perception of events that occurred several hundreds of milliseconds earlier. Long-lasting postdiction requires high-capacity buffers, which store information unconsciously for substantial periods of time. This favors a two-stage model, in which continuous unconscious processing precedes discrete conscious percepts. Such a two-stage model solves the problems of both traditional continuous and discrete models.
In crowding, perception of an object deteriorates in the presence of nearby elements. Although crowding is a ubiquitous phenomenon, since elements are rarely seen in isolation, to date there exists no consensus on how to model it. Previous experiments showed that the global configuration of the entire stimulus must be taken into account. These findings rule out simple pooling or substitution models and favor models sensitive to global spatial aspects. To investigate how to incorporate global aspects into models, we tested a large number of models with a database of forty stimuli tailored to the global aspects of crowding. Our results show that incorporating grouping-like components strongly improves model performance.
Sensory information must be integrated over time to perceive, for example, motion and melodies. Here, to study temporal integration, we used the sequential metacontrast paradigm in which two expanding streams of lines are presented. When a line in one stream is offset observers perceive all other lines to be offset too, even though they are straight. When more lines are offset the offsets integrate mandatorily, i.e., observers cannot report the individual offsets. We show that mandatory integration lasts for up to 450 ms, depending on the observer. Importantly, integration occurs only when offsets are presented within a discrete window of time. Even stimuli that are in close spatio-temporal proximity do not integrate if they are in different windows. A window of integration starts with stimulus onset and integration in the next window has similar characteristics. We present a two-stage computational model based on discrete time windows that captures these effects.
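The discrete-window integration described above can be illustrated with a minimal sketch (this is an illustration of the windowing idea, not the authors' actual computational model; the function names and the summation rule are assumptions for clarity):

```python
# Minimal sketch of discrete-window temporal integration: offsets presented
# within the same window are mandatorily integrated (here: summed), while
# offsets in different windows remain separate percepts. Windows are aligned
# to stimulus onset (t = 0).

def assign_window(t_ms, window_ms=450):
    """Index of the discrete integration window containing time t_ms."""
    return int(t_ms // window_ms)

def integrate_offsets(offsets, window_ms=450):
    """offsets: list of (time_ms, offset_size) pairs.
    Returns the summed offset per window; only same-window offsets integrate."""
    windows = {}
    for t, size in offsets:
        w = assign_window(t, window_ms)
        windows[w] = windows.get(w, 0.0) + size
    return windows

# Two opposite offsets inside one window cancel (mandatory integration)...
print(integrate_offsets([(100, +1.0), (300, -1.0)]))   # {0: 0.0}
# ...but the same offsets in different windows stay distinct, even though the
# second pair is no further apart in time than integrating stimuli can be.
print(integrate_offsets([(100, +1.0), (500, -1.0)]))   # {0: 1.0, 1: -1.0}
```

The second call shows the key property reported above: spatio-temporal proximity alone does not determine integration; window membership does.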
Feedforward Convolutional Neural Networks (ffCNNs) have become state-of-the-art models both in computer vision and neuroscience. However, human-like performance of ffCNNs does not necessarily imply human-like computations. Previous studies have suggested that current ffCNNs do not make use of global shape information. However, it is currently unclear whether this reflects fundamental differences between ffCNN and human processing or is merely an artefact of how ffCNNs are trained. Here, we use visual crowding as a well-controlled, specific probe to test global shape computations. Our results provide evidence that ffCNNs cannot produce human-like global shape computations for principled architectural reasons. We lay out approaches that may address shortcomings of ffCNNs to provide better models of the human visual system.

Geirhos et al. (2019) trained an ffCNN (ResNet50; He, Zhang, Ren, & Sun, 2016) on a stylized dataset in which textural information was no longer useful for classification, thereby biasing the network towards shape-level features. They validated the network's shape bias by showing increased robustness to local noise and textural changes. Alternatively, ffCNNs may be incapable of matching human global computations for principled architectural reasons. Even though Geirhos et al.'s network was able to ignore local features, it may not use global computations in the same way as humans. One difficulty in addressing this question is that there is no consensus about how to experimentally diagnose how deep networks compute global information.
Classically, visual processing is described as a cascade of local feedforward computations. Feedforward Convolutional Neural Networks (ffCNNs) have shown how powerful such models can be. However, using visual crowding as a well-controlled challenge, we previously showed that no classic model of vision, including ffCNNs, can explain human global shape processing. Here, we show that Capsule Neural Networks (CapsNets), combining ffCNNs with recurrent grouping and segmentation, solve this challenge. We also show that ffCNNs and standard recurrent CNNs do not, suggesting that the grouping and segmentation capabilities of CapsNets are crucial. Furthermore, we provide psychophysical evidence that grouping and segmentation are implemented recurrently in humans, and show that CapsNets reproduce these results well. We discuss why recurrence seems needed to implement grouping and segmentation efficiently. Together, we provide mutually reinforcing psychophysical and computational evidence that a recurrent grouping and segmentation process is essential to understand the visual system and create better models that harness global shape computations.
In crowding, the perception of an object deteriorates in the presence of nearby elements. Crowding is a ubiquitous phenomenon, as elements are rarely seen in isolation. One of the main characteristics of crowding is that the elements themselves are not rendered invisible, but their features are averaged [1] or substituted [2] with those of neighboring elements. Recently, Harrison and Bex [3] presented "A Unifying Model of Orientation Crowding in Peripheral Vision", which elegantly explains these two characteristics of crowding with one unifying mechanism. They tested their model using a new crowding paradigm and demonstrated an excellent match between human and model results. A key prediction of their model is that a higher number of flankers leads to stronger crowding, simply because more non-target features contribute to the model's output and thus deteriorate performance. However, several recent studies have shown that increasing the number of flankers can actually improve performance [4-9]. Using the same experimental design as Harrison and Bex [3], we report here that adding more flankers can also improve performance in their paradigm, whereas their model predicts the opposite result. We propose that a truly unified model of crowding must include a grouping stage.
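Why pooling-style models must predict that more flankers hurt can be seen in a toy sketch (this is a hypothetical averaging model for illustration, not Harrison and Bex's actual model): every added non-target feature pulls the pooled estimate further from the target, so performance can only deteriorate.

```python
# Toy pooling sketch: the reported target orientation is the average of all
# feature orientations inside the pooling region, so each additional flanker
# with a non-target orientation increases the estimation error.

def pooled_estimate(target_deg, flanker_degs):
    """Average the target orientation with all flanker orientations."""
    feats = [target_deg] + list(flanker_degs)
    return sum(feats) / len(feats)

target = 0.0          # vertical target
flankers = [20.0]     # one tilted flanker
print(abs(pooled_estimate(target, flankers) - target))       # error: 10.0
print(abs(pooled_estimate(target, flankers * 4) - target))   # error: 16.0
```

Human observers show the opposite pattern in the studies cited above: adding flankers can improve performance, which is why a grouping stage, rather than pure pooling, is proposed.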