In crowding, the perception of a target is strongly impaired by nearby elements. Crowding is often explained by pooling models, which predict that adding flankers increases crowding. In contrast, the centroid hypothesis proposes that adding flankers decreases crowding ("bigger is better"). In foveal vision, we recently showed that adding flankers can increase or decrease crowding, depending on whether the target groups or ungroups from the flankers. We further showed how configural effects, such as good and global Gestalt, determine crowding. Foveal and peripheral crowding do not always show the same characteristics. Here, we show that the very same grouping and Gestalt results found in foveal vision are also found in the periphery. These results can be explained neither by simple pooling models nor by centroid models. We discuss when bigger is better and how grouping might shape crowding.
In crowding, the perception of a target strongly deteriorates when neighboring elements are presented. Crowding is usually assumed to have the following characteristics. (a) Crowding is determined only by nearby elements within a restricted region around the target (Bouma's law). (b) Increasing the number of flankers can only deteriorate performance. (c) Target-flanker interference is feature-specific. These characteristics are usually explained by pooling models, which are in the spirit of classic models of object recognition. In this review, we summarize recent findings showing that crowding is not determined by the above characteristics, thus challenging most models of crowding. We propose that the spatial configuration across the entire visual field determines crowding. Only when one understands how all elements of a visual scene group with each other can one determine crowding strength. We put forward the hypothesis that appearance (i.e., how stimuli look) is a good predictor of crowding, because both crowding and appearance reflect the output of recurrent processing rather than interactions during the initial phase of visual processing.
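The pooling account summarized above can be sketched in a few lines. This is a deliberately minimal toy model, not any specific published model: it assumes that target and flanker features falling within Bouma's window (critical spacing of roughly 0.5 × eccentricity) are simply averaged, which is why adding flankers can only degrade the target estimate under this account.

```python
# Toy sketch of a feature-pooling account of crowding (illustrative only).
# Assumption: features inside Bouma's window are averaged with the target.

def pooled_estimate(target, flankers, eccentricity, distances):
    """Average the target feature with flanker features inside Bouma's window.

    target       -- target feature value (e.g., vernier offset)
    flankers     -- list of flanker feature values
    eccentricity -- target eccentricity in degrees
    distances    -- target-flanker distances in degrees, one per flanker
    """
    window = 0.5 * eccentricity  # Bouma's law: critical spacing ~ 0.5 * ecc
    pooled = [target] + [f for f, d in zip(flankers, distances) if d <= window]
    return sum(pooled) / len(pooled)

# Target offset +1.0 at 10 deg eccentricity: a flanker inside the window
# (distance 3 deg < 5 deg) pulls the estimate; one outside (6 deg) does not.
print(pooled_estimate(1.0, [-1.0], eccentricity=10, distances=[3.0]))  # 0.0
print(pooled_estimate(1.0, [-1.0], eccentricity=10, distances=[6.0]))  # 1.0
```

Note how the sketch reproduces characteristic (b): every flanker admitted to the pool can only move the estimate away from the true target value, so grouping-based release from crowding has no place in this scheme.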
In object recognition, features are thought to be processed in a hierarchical fashion from low-level analysis (edges and lines) to complex figural processing (shapes and objects). Here, we show that figural processing determines low-level processing. Vernier offset discrimination strongly deteriorated when we embedded a vernier in a square. This is a classic crowding effect. Surprisingly, crowding almost disappeared when additional squares were added. We propose that figural interactions between the squares precede low-level suppression of the vernier by the single square, contrary to hierarchical models of object recognition.
Observers perceive objects in the world as stable over space and time, even though the visual experience of those objects is often discontinuous and distorted due to masking, occlusion, camouflage, or noise. How are we able to easily and quickly achieve stable perception in spite of this constantly changing visual input? It was previously shown that observers experience serial dependence in the perception of features and objects, an effect that extends up to 15 seconds back in time. Here, we asked whether the visual system utilizes an object's prior physical location to inform future position assignments in order to maximize location stability of an object over time. To test this, we presented subjects with small targets at random angular locations relative to central fixation in the peripheral visual field. Subjects reported the perceived location of the target on each trial by adjusting a cursor's position to match its location. Subjects made consistent errors when reporting the perceived position of the target on the current trial, mislocalizing it toward the position of the target in the preceding two trials (Experiment 1). This pull in position perception occurred even when a response was not required on the previous trial (Experiment 2). In addition, we show that serial dependence in perceived position occurs immediately after stimulus presentation, and it is a fast stabilization mechanism that does not require a delay (Experiment 3). This indicates that serial dependence occurs for position representations and facilitates the stable perception of objects in space. Taken together with previous work, our results show that serial dependence occurs at many stages of visual processing, from initial position assignment to object categorization.
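The mislocalization pattern described above can be captured by a simple weighted-average sketch. The function and its `pull` weight are assumptions for illustration only and are not fitted to the reported data: the current report is modeled as being pulled toward the average stimulus position of the preceding trials.

```python
# Toy sketch of serial dependence in position reports (illustrative only).
# Assumption: the report is a weighted average of the current stimulus
# position and the mean position of the last two trials (cf. Experiment 1).

def serial_dependent_report(current, history, pull=0.15):
    """Bias the current position report toward recent stimulus positions.

    current -- angular position of the current target (degrees)
    history -- list of previous target positions, oldest first
    pull    -- hypothetical weight on the recent-history average
    """
    if not history:
        return current  # no preceding trials: report is unbiased
    recent = history[-2:]  # the preceding two trials exert the pull
    prior = sum(recent) / len(recent)
    return (1 - pull) * current + pull * prior

# A target at 30 deg is mislocalized toward previous targets near 50 deg.
print(serial_dependent_report(30.0, [48.0, 52.0]))  # 33.0
```

Under this sketch the signature error pattern falls out directly: reports are systematically displaced toward recent history, trading a small single-trial bias for greater positional stability over time.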
We are continuously surrounded by a noisy and ever-changing environment. Instead of analyzing all the elements in a scene, our visual system has the ability to compress an enormous amount of visual information into ensemble representations, such as perceiving a forest instead of every single tree. Still, it is unclear why such complex scenes appear to be the same from moment to moment despite fluctuations, noise, and discontinuities in retinal images. The general effects of change blindness are usually thought to stabilize scene perception, making us unaware of minor inconsistencies between scenes. Here, we propose an alternative: that stable scene perception is actively achieved by the visual system through global serial dependencies, whereby the appearance of scene gist is sequentially dependent on the gist perceived in previous moments. To test this hypothesis, we used summary statistical information as a proxy for "gist"-level, global information in a scene. We found evidence for serial dependence in summary statistical representations. Furthermore, we show that this kind of serial dependence occurs at the ensemble level, where local elements are already merged into global representations. Taken together, our results provide a mechanism through which serial dependence can promote the apparent consistency of scenes over time.
In everyday life, we are constantly surrounded by complex and cluttered scenes. In such cluttered environments, visual perception is primarily limited by crowding, the deleterious influence of nearby objects on object recognition. For the past several decades, visual crowding was assumed to occur at a single stage, only between low-level features or object parts, thus dismantling, destroying, or filtering object information. A large and converging body of evidence has demonstrated that this assumption is false: crowding occurs at multiple stages of visual analysis, and information passes through crowding at each of these stages. This converging empirical evidence points to a seeming paradox: crowding happens at multiple levels, which would seem to impair object recognition, and yet visual information at each of those levels is maintained intact and influences subsequent higher-level visual processing. Thus, while crowding impairs the access we have to visual information at many levels, it does not impair the representation of that information. The resolution of this paradox reveals how the visual system strikes a balance between the limits of object selection and the desire to represent multiple levels of visual information throughout cluttered scenes. Understanding crowding is therefore key to resolving the relationship between the richness of object and scene representations and the limits of conscious object recognition.
Individuals can quickly and effortlessly recognize facial expressions, which is critical for social perception and emotion regulation. This sensitivity to even slight facial changes could result in unstable percepts of an individual's expression over time. The visual system must therefore balance accuracy with maintaining perceptual stability. However, previous research has focused on our sensitivity to changing expressions, and the mechanism behind expression stability remains an open question. Recent results demonstrate that perception of facial identity is systematically biased toward recently seen visual input. This positive perceptual pull, or serial dependence, may help stabilize perceived expression. To test this, observers judged random facial expression morphs ranging from happy to sad to angry. We found a pull in perceived expression toward previously seen expressions, but only when the 1-back and current face had similar identities. Our results are consistent with the existence of the continuity field for expression, a specialized mechanism that promotes the stability of emotion perception, which could help facilitate social interactions and emotion regulation.
Investigations of visual crowding, where a target is difficult to identify because of flanking elements, have largely used a theoretical perspective based on local interactions, where flanking elements pool with or substitute for properties of the target. This successful theoretical approach has motivated a wide variety of empirical investigations to identify mechanisms that cause crowding, and it has suggested practical applications to mitigate crowding effects. However, this theoretical approach has been unable to account for a parallel set of findings that crowding is influenced by long-range perceptual grouping effects. When the target and flankers are perceived as part of separate visual groups, crowding tends to be quite weak. Here, we describe how theoretical mechanisms for grouping and segmentation in cortical neural circuits can account for a wide variety of these long-range grouping effects. Building on previous work, we explain how crowding occurs in the model and how grouping in the model involves connected boundary signals that represent a key aspect of visual information. We then introduce new circuits that allow nonspecific top-down selection signals to flow along connected boundaries, or within a surface contained by boundaries, and thereby induce a segmentation that can separate the visual information corresponding to the flankers from the visual information corresponding to the target. When such segmentation occurs, crowding is shown to be weak. We compare the model's behavior to five sets of experimental findings on visual crowding and show that the model does a good job of explaining the key empirical findings.