Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations

Dapello, Joel; Marques, Tiago; Schrimpf, Martin; Geiger, Franziska; Cox, David D.; DiCarlo, James J.

doi:10.1101/2020.06.16.154542

Cited by 118 publications (152 citation statements)

References 53 publications

Supporting

Mentioning

124

Contrasting

“…Crucially, in our work we provide a direct link of the necessity of noise for systems that aim at optimizing decision behavior under our encoding and limited-capacity assumptions, which can be seen as algorithmic specifications of the more realistic population coding specifications mentioned above (Nikitin et al, 2009). We argue that our results may provide a formal intuition for the apparent necessity of noise for improving training and learning performance in artificial neural networks (Dapello et al, 2020; Findling and Wyart, 2020), and we speculate that an implementation of 'the right' noise distribution for a given environmental statistic could be seen as a potential mechanism to improve performance in capacity-limited agents generally speaking (Garrett et al, 2011). We acknowledge that based on the results of our work, we cannot confirm whether this is the case for higher order neural circuits, however, we leave it as an interesting theoretical formulation, which could be addressed in future work.…”

Section: Discussion (mentioning)

confidence: 72%

Efficient sampling and noisy decisions

Heng, Woodford, Polanía (2020), eLife

Human decisions are based on finite information, which makes them inherently imprecise. But what determines the degree of such imprecision? Here, we develop an efficient coding framework for higher-level cognitive processes in which information is represented by a finite number of discrete samples. We characterize the sampling process that maximizes perceptual accuracy or fitness under the often-adopted assumption that full adaptation to an environmental distribution is possible, and show how the optimal process differs when detailed information about the current contextual distribution is costly. We tested this theory on a numerosity discrimination task, and found that humans efficiently adapt to contextual distributions, but in the way predicted by the model in which people must economize on environmental information. Thus, understanding decision behavior requires that we account for biological restrictions on information coding, challenging the often-adopted assumption of precise prior knowledge in higher-level decision systems.

“…But here too, if machines were burdened with humanlike visual acuity and so could barely represent the high-frequency features in the training set (i.e., the features most distorted by this sort of noise), they may be less sensitive to the patterns that later mislead them (74). Indeed, recent work finds that giving CNNs a humanlike fovea (75) or a hidden layer simulating V1 (76)…”

Section: Limit Machines Like Humans (mentioning)

confidence: 99%

Performance vs. competence in human–machine comparisons

Firestone (2020), Proc. Natl. Acad. Sci. U.S.A.

123

Does the human mind resemble the machines that can behave like it? Biologically inspired machine-learning systems approach “human-level” accuracy in an astounding variety of domains, and even predict human brain activity—raising the exciting possibility that such systems represent the world like we do. However, even seemingly intelligent machines fail in strange and “unhumanlike” ways, threatening their status as models of our minds. How can we know when human–machine behavioral differences reflect deep disparities in their underlying capacities, vs. when such failures are only superficial or peripheral? This article draws on a foundational insight from cognitive science—the distinction between performance and competence—to encourage “species-fair” comparisons between humans and machines. The performance/competence distinction urges us to consider whether the failure of a system to behave as ideally hypothesized, or the failure of one creature to behave like another, arises not because the system lacks the relevant knowledge or internal capacities (“competence”), but instead because of superficial constraints on demonstrating that knowledge (“performance”). I argue that this distinction has been neglected by research comparing human and machine behavior, and that it should be essential to any such comparison. Focusing on the domain of image classification, I identify three factors contributing to the species-fairness of human–machine comparisons, extracted from recent work that equates such constraints. Species-fair comparisons level the playing field between natural and artificial intelligence, so that we can separate more superficial differences from those that may be deep and enduring.

“…In our case, the human constraint – limited visual acuity – could play a vital role in incorporating LSF information for object categorisation. In fact, modelling a human fovea (Deza & Konkle, 2020) or primary visual cortex (Dapello et al, 2020) at the front of CNNs can increase their robustness to adversarial examples. Note that adversarial examples usually include subtle changes to images at high spatial frequencies.…”

Section: Discussion (mentioning)

confidence: 99%

Training for object recognition with increasing spatial frequency: A comparison of deep learning with human vision

Avberšek, Zeman, Op de Beeck (2021), Preprint

The ontogenetic development of human vision, and the real-time neural processing of visual input, both exhibit a striking similarity – a sensitivity towards spatial frequencies that progress in a coarse-to-fine manner. During early human development, sensitivity for higher spatial frequencies increases with age. In adulthood, when humans receive new visual input, low spatial frequencies are typically processed first before subsequently guiding the processing of higher spatial frequencies. We investigated to what extent this coarse-to-fine progression might impact visual representations in artificial vision and compared this to adult human representations. We simulated the coarse-to-fine progression of image processing in deep convolutional neural networks (CNNs) by gradually increasing spatial frequency information during training. We compared CNN performance, after standard and coarse-to-fine training, with a wide range of datasets from behavioural and neuroimaging experiments. In contrast to humans, CNNs that are trained using the standard protocol are very insensitive to low spatial frequency information, showing very poor performance in being able to classify such object images. By training CNNs using our coarse-to-fine method, we improved the classification accuracy of CNNs from 0% to 32% on low-pass filtered images taken from the ImageNet dataset. When comparing differently trained networks on images containing full spatial frequency information, we saw no representational differences. Overall, this integration of computational, neural, and behavioural findings shows the relevance of the exposure to and processing of input with a variation in spatial frequency content for some aspects of high-level object representations.
