2020
DOI: 10.1016/j.visres.2019.12.006

Crowding reveals fundamental differences in local vs. global processing in humans and machines

Abstract: Feedforward Convolutional Neural Networks (ffCNNs) have become state-of-the-art models both in computer vision and neuroscience. However, human-like performance of ffCNNs does not necessarily imply human-like computations. Previous studies have suggested that current ffCNNs do not make use of global shape information. However, it is currently unclear whether this reflects fundamental differences between ffCNN and human processing or is merely an artefact of how ffCNNs are trained. Here, we use visual crowding …

Cited by 49 publications (32 citation statements)
References 40 publications
“…Hence, our results show that, given adequate priors, CapsNets explain uncrowding. We have shown that ffCNNs and CNNs with lateral or top-down recurrent connections do not produce uncrowding, even when they are trained identically on groups of identical shapes and successfully learn on the training data, comparably to the CapsNets (furthermore, we showed previously that ffCNNs trained on large datasets, which are often used as general models of vision, do not show uncrowding either; [ 17 ]). This shows that merely training networks on groups of identical shapes is not sufficient to explain uncrowding.…”
Section: Discussion (mentioning)
confidence: 95%
“…In previous work, we have shown that pretrained ffCNNs cannot explain uncrowding [ 17 ], even if they are biased towards global shape processing [ 13 ]. Currently, CapsNets cannot be trained on large-scale tasks such as ImageNet because routing by agreement is computationally too expensive.…”
Section: Results (mentioning)
confidence: 99%
“…In particular, much work with vernier and letter stimuli showed that even small changes to the contextual stimuli, or changes far away from the target, can lead to target-surround ungrouping and a considerable reduction in crowding (Kooi, Toet, Tripathy, & Levi, 1994; Manassi, Sayim, & Herzog, 2012; Manassi et al, 2016; Manassi, Hermens, Francis, & Herzog, 2015; Manassi et al, 2013; Saarela, Sayim, Westheimer, & Herzog, 2009), a phenomenon known as “uncrowding”. It has been argued that these results show a failure of feedforward pooling models, such as the SS model, and that this failure is due to their lack of recurrent processes of grouping and segmentation (Doerig et al, 2019; Doerig, Bornet, Choung, & Herzog, 2020; Herzog et al, 2015; Francis, Manassi, & Herzog, 2017). Furthermore, current SS model implementations also fail to capture the peripheral appearance of natural scenes that contain strong grouping and segmentation cues (Wallis et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%