1 Summary

Gloss perception is a challenging visual inference that requires disentangling the contributions of reflectance, lighting, and shape to the retinal image [1-3]. Learning to see gloss must somehow proceed without labelled training data, as no other sensory signals can provide the 'ground truth' required for supervised learning [4-6]. We reasoned that, paradoxically, we may learn to infer distal scene properties, like gloss, by learning to compress and predict spatial structure in proximal image data. We hypothesised that such unsupervised learning might explain both successes and failures of human gloss perception, where classical 'inverse optics' cannot. To test this, we trained unsupervised neural networks to model the pixel statistics of renderings of glossy surfaces and compared the resulting representations with human gloss judgments. The trained networks spontaneously cluster images according to underlying scene properties such as specular reflectance, shape and illumination, despite receiving no explicit information about them. More importantly, we find that linearly decoding specular reflectance from the model's internal code predicts human perception and misperception of glossiness on an image-by-image basis better than the true physical reflectance does, better than supervised networks explicitly trained to estimate specular reflectance, and better than alternative image-statistic and dimensionality-reduction models. Indeed, the unsupervised networks correctly predict well-known illusions of gloss perception caused by interactions between surface relief and lighting [7,8], which the supervised models entirely fail to predict.
Our findings suggest that unsupervised learning may explain otherwise inexplicable errors in surface perception, with broader implications for how biological brains learn to see the outside world.

2 Highlights

- We trained unsupervised neural networks to synthesise images of glossy surfaces
- They spontaneously learned to encode gloss, lighting and other scene factors
- The networks correctly predict both errors and successes of human gloss perception
- The findings provide new insights into how the brain likely learns to see

3 Results and Discussion

The central intuition behind our findings is that learning to compress the complex image structure created by reflections from glossy surfaces into a highly compact code forces the brain to discover representations that partially, but imperfectly, disentangle the distal physical factors responsible for variations within and between images. This potentially explains both the broad successes and the specific pattern of errors known to occur in gloss perception [2,3,7-12]. To test this, we rendered 10,000 images from a virtual world of bumpy frontoparallel surfaces with either high or low specular reflectance, random colour and depth of surface relief, illuminated by six natural light fields (Figure 1A-B). We trained ten instances of an unsupervised PixelVAE network [13,14] on this dataset. The model's training objecti...
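The linear-decoding analysis described above can be illustrated with a minimal sketch: fit a linear read-out that maps a network's latent codes to the binary reflectance label (high vs. low gloss). The latent codes below are synthetic, and all names, dimensions, and parameter values are illustrative assumptions, not the actual analysis code.

```python
import numpy as np

# Hypothetical illustration of linearly decoding a binary specular-reflectance
# label (high vs. low gloss) from a network's latent codes. The codes here are
# simulated: gloss information is partially disentangled, i.e. concentrated in
# a few latent dimensions, with noise in all of them.
rng = np.random.default_rng(0)

n_per_class, latent_dim = 200, 10
mu_low = np.zeros(latent_dim)
mu_high = np.zeros(latent_dim)
mu_high[:3] = 1.5  # assumed: gloss signal carried by 3 of 10 dimensions

z_low = rng.normal(mu_low, 1.0, size=(n_per_class, latent_dim))
z_high = rng.normal(mu_high, 1.0, size=(n_per_class, latent_dim))

Z = np.vstack([z_low, z_high])
y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])

# Fit a linear read-out by least squares (with a bias column), then
# classify each image by the sign of the decoded value.
X = np.hstack([Z, np.ones((Z.shape[0], 1))])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
decoded = X @ w
accuracy = np.mean(np.sign(decoded) == y)
print(f"decoding accuracy: {accuracy:.2f}")
```

Because the simulated codes only partially separate the two reflectance classes, the read-out is imperfect, mirroring the idea that a compact unsupervised code disentangles gloss approximately rather than exactly.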