Learning divisive normalization in primary visual cortex

Burg, Max F.; Cadena, Santiago A.; Denfield, George H.; Walker, Edgar Y.; Tolias, Andreas S.; Bethge, Matthias; Ecker, Alexander

doi:10.1371/journal.pcbi.1009028

Cited by 26 publications

(17 citation statements)

References 55 publications


“…These two approaches (goal-driven and measurement-driven deep models) have been thoroughly compared in V1 and were found to be superior to linear filter-banks and simple linear–nonlinear models (Cadena et al., 2019). However, more recently, the same team has shown that linear–nonlinear models with general divisive normalization make a significant step towards the performance of state-of-the-art CNN with interpretable parameters (Burg et al., 2021).…”

Section: Discussion (mentioning)

confidence: 99%

“…Following the move from conventional CNNs in Cadena et al. (2019) to more realistic divisive normalization models in Burg et al. (2021), we think that future goal-driven derivations of low-level visual psychophysics (e.g., pattern masking or perceptual distortion) should include more realistic architectures too, as opposed to conventional CNNs (although they may be flexible enough to fulfill the goal).…”

Section: Discussion (mentioning)

confidence: 99%


Contrast sensitivity functions in autoencoders

Gomez-Villa, Bertalmío, et al. 2022. Journal of Vision


Three decades ago, Atick et al. suggested that human frequency sensitivity may emerge from the enhancement required for a more efficient analysis of retinal images. Here we reassess the relevance of low-level vision tasks in the explanation of the contrast sensitivity functions (CSFs) in light of 1) the current trend of using artificial neural networks for studying vision, and 2) the current knowledge of retinal image representations. As a first contribution, we show that a very popular type of convolutional neural networks (CNNs), called autoencoders, may develop human-like CSFs in the spatiotemporal and chromatic dimensions when trained to perform some basic low-level vision tasks (like retinal noise and optical blur removal), but not others (like chromatic adaptation or pure reconstruction after simple bottlenecks). As an illustrative example, the best CNN (in the considered set of simple architectures for enhancement of the retinal signal) reproduces the CSFs with a root mean square error of 11% of the maximum sensitivity. As a second contribution, we provide experimental evidence of the fact that, for some functional goals (at low abstraction level), deeper CNNs that are better in reaching the quantitative goal are actually worse in replicating human-like phenomena (such as the CSFs). This low-level result (for the explored networks) is not necessarily in contradiction with other works that report advantages of deeper nets in modeling higher level vision goals. However, in line with a growing body of literature, our results suggest another word of caution about CNNs in vision science because the use of simplified units or unrealistic architectures in goal optimization may be a limitation for the modeling and understanding of human vision.


“…We believe that a fruitful avenue for gaining insight into the brain's functional processing is to iteratively combine both approaches: developing new models to push the state-of-the-art predictive performance and at the same time extracting knowledge by simplifying complex models or by analyzing models post-hoc. For example, Burg et al (2021) simplified the state-of-the-art model by Cadena et al (2019) showing that divisive normalization accounts for most but not all of its performance; and Ustyuzhaninov et al (2022) simplified and analyzed the representations learned by a high-performing complex model revealing a combinatorial code of non-linear computations in mouse V1. Additionally, high performing predictive models may also benefit computational neuroscientists by serving as digital twins, creating an in silico environment in which hypotheses may be developed and refined before returning to the in vivo system for validation (Bashivan et al, 2019; Franke et al, 2021; Ponce et al, 2019; Walker et al, 2019).…”

Section: Discussion (mentioning)

confidence: 99%

“…The work on predictive models of neural responses to visual inputs has a long history that includes simple linear-nonlinear (LN) models (Heeger, 1992a,b; Jones & Palmer, 1987), energy models (Adelson & Bergen, 1985), more general subunit/LN-LN models (Rust et al, 2005; Schwartz et al, 2006; Touryan et al, 2005; Vintch et al, 2015), and multi-layer neural network models (Lau et al, 2002; Lehky et al, 1992; Prenger et al, 2004; Zipser & Andersen, 1988). The deep learning revolution set new standards in prediction performance by leveraging task-optimized deep convolutional neural networks (CNNs) (Cadena et al, 2019; Cadieu et al, 2014; Yamins et al, 2014) and CNN-based architectures incorporating a shared encoding learned end-to-end for thousands of neurons (Antolík et al, 2016; Bashiri et al, 2021; Batty et al, 2016; Burg et al, 2021; Cadena et al, 2019; Cowley & Pillow, 2020; Ecker et al, 2018; Franke et al, 2021; Kindel et al, 2017; Klindt et al, 2017; Lurz et al, 2020; McIntosh et al, 2016; Sinz et al, 2018; Walker et al, 2019; Zhang et al, 2018).…”

Section: Introduction (mentioning)

confidence: 99%

The Sensorium competition on predicting large-scale mouse primary visual cortex activity

Willeke, Fahey, Bashiri, et al. 2022. Preprint

Self Cite


The neural underpinning of the biological visual system is challenging to study experimentally, in particular as the neuronal activity becomes increasingly nonlinear with respect to visual input. Artificial neural networks (ANNs) can serve a variety of goals for improving our understanding of this complex system, not only serving as predictive digital twins of sensory cortex for novel hypothesis generation in silico, but also incorporating bio-inspired architectural motifs to progressively bridge the gap between biological and machine vision. The mouse has recently emerged as a popular model system to study visual information processing, but no standardized large-scale benchmark to identify state-of-the-art models of the mouse visual system has been established. To fill this gap, we propose the SENSORIUM benchmark competition. We collected a large-scale dataset from mouse primary visual cortex containing the responses of more than 28,000 neurons across seven mice stimulated with thousands of natural images, together with simultaneous behavioral measurements that include running speed, pupil dilation, and eye movements. The benchmark challenge will rank models based on predictive performance for neuronal responses on a held-out test set, and includes two tracks for model input limited to either stimulus only (SENSORIUM) or stimulus plus behavior (SENSORIUM+). We provide a starting kit to lower the barrier for entry, including tutorials, pretrained baseline models, and APIs with one-line commands for data loading and submission. We would like to see this as a starting point for regular challenges and data releases, and as a standard tool for measuring progress in large-scale neural system identification models of the mouse visual system and beyond.


“…Regarding the nature of γ, the above example points out that the consideration of spatial neighborhoods is convenient to get the local contrast equalization along the visual field required to overcome shadow, scattering or fog. The recent literature that exploits automatic differentiation shows a range of kernel structures: some do not consider spatial interactions (either in a dense [11,16,24] or convolutional [20] combinations of features); while others do with some restrictions (either uniform weights [29], a ring of locations [18,19], or special symmetries in the space [30]). Following biology [3,8,21,28] and the intuition pointed out in the previous section.…”

Section: Models and Experiments (mentioning)

confidence: 99%

Neural Networks with Divisive normalization for image segmentation with application in cityscapes dataset

Hernández-Cámara, Laparra, Malo 2022. Preprint


One of the key problems in computer vision is adaptation: models are too rigid to follow the variability of the inputs. The canonical computation that explains adaptation in sensory neuroscience is divisive normalization, and it has appealing effects on image manifolds. In this work we show that including divisive normalization in current deep networks makes them more invariant to non-informative changes in the images. In particular, we focus on U-Net architectures for image segmentation. Experiments show that the inclusion of divisive normalization in the U-Net architecture leads to better segmentation results with respect to conventional U-Net. The gain increases steadily when dealing with images acquired in bad weather conditions. In addition to the results on the Cityscapes and Foggy Cityscapes datasets, we explain these advantages through visualization of the responses: the equalization induced by the divisive normalization leads to more invariant features to local changes in contrast and illumination.
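As a rough illustration of the computation these citing works refer to, the sketch below applies a generic divisive normalization to a stack of feature maps: each response is divided by a constant plus a weighted pool of responses across channels and, optionally, across a uniform spatial neighborhood. This is a minimal sketch under stated assumptions and not the formulation used in any of the papers listed here; the function name `divisive_normalization`, the parameters `channel_weights`, `sigma`, and `radius`, the uniform box pooling, and the omission of exponents are all illustrative choices.

```python
import numpy as np


def divisive_normalization(x, channel_weights, sigma=1.0, radius=0):
    """Generic divisive normalization (illustrative, hypothetical helper).

    x               : array of shape (C, H, W) with non-negative responses
    channel_weights : array of shape (C, C); entry [i, j] is how strongly
                      channel j contributes to the pool that divides channel i
    sigma           : non-negative constant added to the denominator
    radius          : half-width of a uniform spatial pooling window;
                      radius=0 means no spatial interactions
    """
    x = np.asarray(x, dtype=float)
    _, h, w = x.shape
    k = 2 * radius + 1

    # Uniform spatial average over a k x k window, using edge padding.
    padded = np.pad(x, ((0, 0), (radius, radius), (radius, radius)), mode="edge")
    pooled = np.zeros_like(x)
    for dy in range(k):
        for dx in range(k):
            pooled += padded[:, dy:dy + h, dx:dx + w]
    pooled /= k * k

    # Mix channels: pool[i] = sum_j channel_weights[i, j] * pooled[j].
    pool = np.tensordot(channel_weights, pooled, axes=([1], [0]))

    return x / (sigma + pool)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    responses = rng.random((3, 8, 8)) + 0.1   # toy non-negative feature maps
    weights = np.ones((3, 3))                 # every channel pools all channels
    normalized = divisive_normalization(responses, weights, sigma=0.5, radius=1)
    print(normalized.shape)
```

With `sigma=0` and a strictly positive pool, multiplying the input by a constant leaves the output unchanged, which is the kind of local contrast equalization the abstract above describes. Setting `radius=0` corresponds to a pool without spatial interactions, and `radius>0` to a uniform-weight spatial neighborhood.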
