Despite great efforts over several decades, our best models of primary visual cortex (V1) still predict neural responses quite poorly when probed with natural stimuli, highlighting our limited understanding of the nonlinear computations in V1. At the same time, recent advances in machine learning have shown that deep neural networks can learn highly nonlinear functions for visual information processing. Two approaches based on deep learning have recently been successfully applied to neural data: transfer learning for predicting neural activity in higher areas of the primate ventral stream and data-driven models to predict retina and V1 neural activity of mice. However, so far there exists no comparison between the two approaches and neither of them has been used to model the early primate visual system. Here, we test the ability of both approaches to predict neural responses to natural images in V1 of awake monkeys. We found that both deep learning approaches outperformed classical linear-nonlinear and wavelet-based feature representations building on existing V1 encoding theories. On our dataset, transfer learning and data-driven models performed similarly, while the data-driven model employed a much simpler architecture. Thus, multi-layer CNNs set the new state of the art for predicting neural responses to natural images in primate V1. Having such good predictive in-silico models opens the door for quantitative studies of yet unknown nonlinear computations in V1 without being limited by the available experimental time.
Despite great efforts over several decades, our best models of primary visual cortex (V1) still predict spiking activity quite poorly when probed with natural stimuli, highlighting our limited understanding of the nonlinear computations in V1. Recently, two approaches based on deep learning have emerged for modeling these nonlinear computations: transfer learning from artificial neural networks trained on object recognition and data-driven convolutional neural network models trained end-to-end on large populations of neurons. Here, we test the ability of both approaches to predict spiking activity in response to natural images in V1 of awake monkeys. We found that the transfer learning approach performed similarly well to the data-driven approach and both outperformed classical linear-nonlinear and wavelet-based feature representations that build on existing theories of V1. Notably, transfer learning using a pre-trained feature space required substantially less experimental time to achieve the same performance. In conclusion, multi-layer convolutional neural networks (CNNs) set the new state of the art for predicting neural responses to natural images in primate V1 and deep features learned for object recognition are better explanations for V1 computation than all previous filter bank theories. This finding strengthens the necessity of V1 models that are multiple nonlinearities away from the image domain and it supports the idea of explaining early visual cortex based on high-level functional goals.
The McGurk effect is an illusion in which visual speech information dramatically alters the perception of auditory speech. However, there is a high degree of individual variability in how frequently the illusion is perceived: some individuals almost always perceive the McGurk effect, while others rarely do. Another axis of individual variability is the pattern of eye movements made while viewing a talking face: some individuals often fixate the mouth of the talker, while others rarely do. Since the talker's mouth carries the visual speech information necessary to induce the McGurk effect, we hypothesized that individuals who frequently perceive the McGurk effect should spend more time fixating the talker's mouth. We used infrared eye tracking to study eye movements as 40 participants viewed audiovisual speech. Frequent perceivers of the McGurk effect were more likely to fixate the mouth of the talker, and there was a significant correlation between McGurk frequency and mouth looking time. The noisy encoding of disparity model of McGurk perception showed that individuals who frequently fixated the mouth had lower sensory noise and higher disparity thresholds than those who rarely fixated the mouth. Differences in eye movements when viewing the talker's face may be an important contributor to interindividual differences in multisensory speech perception.
The rise of big data in modern research poses serious challenges for data management: Large and intricate datasets from diverse instrumentation must be precisely aligned, annotated, and processed in a variety of ways to extract new insights. While high levels of data integrity are expected, research teams have diverse backgrounds, are geographically dispersed, and rarely possess a primary interest in data science. Here we describe DataJoint, an open-source toolbox designed for manipulating and processing scientific data under the relational data model. Designed for scientists who need a flexible and expressive database language with few basic concepts and operations, DataJoint facilitates multiuser access, efficient queries, and distributed computing. With implementations in both MATLAB and Python, DataJoint is not limited to particular file formats, acquisition systems, or data modalities and can be quickly adapted to new experimental designs. DataJoint and related resources are available at http://datajoint.github.com.
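The relational pattern the abstract describes — acquisition tables precisely aligned with downstream processing tables and combined through queries — can be sketched with Python's stdlib `sqlite3`. This is a toy illustration of the relational data model only, not DataJoint's actual API (which defines tables declaratively and tracks computation dependencies); the table and field names below are invented:

```python
import sqlite3

# Toy relational pipeline: an acquisition table (session) joined to a
# processed-results table (spike_rate). Table and field names are
# invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session (
    session_id   INTEGER PRIMARY KEY,
    animal_id    TEXT NOT NULL,
    session_date TEXT NOT NULL
);
CREATE TABLE spike_rate (
    session_id INTEGER REFERENCES session(session_id),
    unit_id    INTEGER,
    rate_hz    REAL,
    PRIMARY KEY (session_id, unit_id)
);
""")
conn.executemany("INSERT INTO session VALUES (?, ?, ?)",
                 [(1, "M01", "2024-05-01"), (2, "M02", "2024-05-02")])
conn.executemany("INSERT INTO spike_rate VALUES (?, ?, ?)",
                 [(1, 1, 4.2), (1, 2, 7.9), (2, 1, 3.1)])

# A restricted join: mean firing rate per session for one animal.
rows = conn.execute("""
    SELECT s.session_id, AVG(r.rate_hz)
    FROM session s JOIN spike_rate r USING (session_id)
    WHERE s.animal_id = 'M01'
    GROUP BY s.session_id
""").fetchall()
```

The design point is that alignment between raw and derived data lives in the primary keys and foreign-key relationships, so queries like the one above stay correct as the dataset grows.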
To better understand the representations in visual cortex, we need to generate better predictions of neural activity in awake animals presented with their ecological input: natural video. Despite recent advances in models for static images, models for predicting responses to natural video are scarce and standard linear-nonlinear models perform poorly. We developed a new deep recurrent network architecture that predicts inferred spiking activity of thousands of mouse V1 neurons simultaneously recorded with two-photon microscopy, while accounting for confounding factors such as the animal's gaze position and brain state changes related to running state and pupil dilation. Powerful system identification models provide an opportunity to gain insight into cortical functions through in silico experiments that can subsequently be tested in the brain. However, in many cases this approach requires that the model is able to generalize to stimulus statistics that it was not trained on, such as band-limited noise and other parameterized stimuli. We investigated these domain transfer properties in our model and found that a model trained on natural images is able to correctly predict the orientation tuning of neurons in response to artificial noise stimuli. Finally, we show that we can fully generalize from movies to noise and maintain high predictive performance on both stimulus domains by fine-tuning only the final layer's weights on a network otherwise trained on natural movies. The converse, however, is not true.
Bayesian models of behavior suggest that organisms represent uncertainty associated with sensory variables. However, the neural code of uncertainty remains elusive. A central hypothesis is that uncertainty is encoded in the population activity of cortical neurons in the form of likelihood functions. We studied the neural code of uncertainty by simultaneously recording population activity from the primate visual cortex during a visual categorization task in which trial-to-trial uncertainty about stimulus orientation was relevant for the decision. We decoded the likelihood function from the trial-to-trial population activity and found that it predicted decisions better than a point estimate of orientation. This remained true when we conditioned on the true orientation, suggesting that internal fluctuations in neural activity drive behaviorally meaningful variations in the likelihood function. Our results establish the role of population-encoded likelihood functions in mediating behavior, and provide a neural underpinning for Bayesian models of perception.
When making perceptual decisions, organisms often benefit from representing uncertainty about sensory variables. More specifically, the theory that the brain performs Bayesian inference — which has roots in the work of Laplace [1] and von Helmholtz [2] — has been widely used to explain human and animal perception [3-6]. At its core lies the assumption that the brain maintains a statistical model of the world and, when confronted with incomplete and imperfect information, makes inferences by computing probability distributions over task-relevant world state variables (e.g. direction of motion of a stimulus). In spite of the prevalence of Bayesian theories in neuroscience, evidence to support them stems primarily from behavioral studies (e.g. [7,8]).
Consequently, the manner in which probability distributions are encoded in the brain remains unclear, and, thus, the neural code of uncertainty is unknown. It has been hypothesized that a critical feature of the neural code of uncertainty, shared throughout the sensory processing chain in the neocortex, is that the same neurons that encode a specific world state variable (e.g. stimulus orientation in V1) also encode the uncertainty about that variable (Fig. 1a). Neurons therefore multiplex both a point estimate of a sensory variable and the associated uncertainty about it [9,10]. Specifically, according to the probabilistic population coding (PPC) hypothesis [9,10], inference in the brain is performed by inverting a generative model of neural population activity. Under this coding scheme, neural populations in V1, for example, that encode stimulus orientation also encode the associated uncertainty. […] We found that using the trial-to-trial changes in the shape of the likelihood function allowed us to better predict behavior than using a likelihood function with a fixed shape shifted by a…
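The likelihood decoding described above can be made concrete with a toy sketch: under the standard PPC assumption of independent Poisson spiking, the (log-)likelihood over orientation follows directly from the population's tuning curves and the observed spike counts. The Gaussian tuning shapes, gains, and spike counts below are invented for illustration and are not the recordings analyzed here:

```python
import math

def log_likelihood(spike_counts, tuning_curves, orientations):
    """Log-likelihood over orientation under independent Poisson noise:
    log L(theta) = sum_i [r_i * log f_i(theta) - f_i(theta)] + const."""
    out = []
    for theta in orientations:
        ll = 0.0
        for r, f in zip(spike_counts, tuning_curves):
            rate = f(theta)
            ll += r * math.log(rate) - rate
        out.append(ll)
    return out

def make_tuning(pref_deg, gain=10.0, width=20.0, baseline=0.5):
    """Toy Gaussian tuning curve (illustrative, not fit to data)."""
    return lambda theta: baseline + gain * math.exp(
        -0.5 * ((theta - pref_deg) / width) ** 2)

prefs = [0, 30, 60, 90, 120, 150]          # preferred orientations (deg)
population = [make_tuning(p) for p in prefs]
orientations = list(range(0, 180, 5))       # decoding grid

# Simulated spike counts roughly consistent with a 60-degree stimulus
counts = [1, 5, 11, 5, 1, 0]
logL = log_likelihood(counts, population, orientations)
decoded = orientations[max(range(len(logL)), key=logL.__getitem__)]
# The full logL profile, not just `decoded`, is what carries uncertainty:
# a flatter profile corresponds to higher trial-to-trial uncertainty.
```

The key point matching the text: the same spike counts yield both a point estimate (the peak of the likelihood) and an uncertainty signal (its shape), so no separate "uncertainty neurons" are needed.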