2022 | Preprint
DOI: 10.31234/osf.io/5zf4s

Deep Problems with Neural Network Models of Human Vision

Abstract: Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral benchmark datasets, and (3) DNNs do the best job in predicting …

Cited by 23 publications (15 citation statements)
References 102 publications
“…We have argued for this approach in relation to making inferences about mechanistic similarity between DNNs and humans [29]. In fact, research relating DNNs to human vision provides a striking case of a disconnect between RSA and behavioural findings from psychology [29–31]. The findings here may explain contradictory RSA scores between DNNs and human visual processing as pointed out by Xu and Vaziri-Pashkam [20].…”
Section: Discussion
confidence: 64%
“…However, another approach is more tractable: conduct controlled experiments to establish whether the two systems are representing information in similar ways. We have argued for this approach in relation to making inferences about mechanistic similarity between DNNs and humans [29]. In fact, research relating DNNs to human vision provides a striking case of a disconnect between RSA and behavioural findings from psychology [29–31].…”
Section: Discussion
confidence: 99%
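The excerpts above turn on representational similarity analysis (RSA), which compares two systems not by their raw activations but by correlating their representational dissimilarity matrices (RDMs). A minimal sketch of the standard procedure follows; the function names and the NumPy/SciPy usage are illustrative, not taken from the cited papers:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(responses):
    """Representational dissimilarity matrix for one system.

    `responses` is (n_stimuli, n_features): one response pattern per
    stimulus. Returns the condensed upper triangle of pairwise
    correlation distances (1 - Pearson r) between stimulus patterns.
    """
    return pdist(responses, metric="correlation")

def rsa_score(responses_a, responses_b):
    """Second-order similarity: Spearman correlation of the two RDMs.

    Both systems must be probed with the same stimuli, in the same
    order, but may have different numbers of features (e.g. DNN units
    vs. voxels), which is what makes RSA attractive for cross-system
    comparison.
    """
    rho, _ = spearmanr(rdm(responses_a), rdm(responses_b))
    return rho
```

Because the comparison happens at the level of dissimilarity structure, two systems can receive a high RSA score while representing the stimuli in mechanistically different ways, which is one reason the excerpts argue that RSA scores alone can diverge from controlled behavioural findings.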
“…These failures may reflect a range of processes present in humans but absent in CNNs trained to recognise objects through supervised learning, such as figure-ground segregation, completing objects behind occluders, encoding border ownership, and inferring 3D properties of the object [43]. Consistent with this hypothesis, Jacob et al [27] and Bowers et al [7] have recently highlighted a number of these failures in CNNs, including a failure to represent 3D structure, occlusion, and parts of objects. More broadly, these results challenge the claim that CNNs trained to recognise objects through supervised learning are good models of the ventral visual stream of human vision (see, for example, [52, 8, 37]).…”
Section: Discussion
confidence: 99%
“…But using ANNs in that way requires careful construction and comparison to ensure meaningful inferences can be drawn precisely because of these (and other) disanalogies baked into the technology. Failure to account for this can lead to misleading conclusions and faulty science [14]. On the other hand, nothing we have said means that ANNs cannot be brought further in line with biology to fruitful ends.…”
Section: Ontological Unification
confidence: 97%
“…On a basic level, the tendency to ignore domain experts (Section 4.1) and the issues around path dependency (Section 4.2) may make ML systems less effective tools, which for systems that are so widely used has immediate social welfare implications. And as we discussed in Section 4.4, the trend of increased black-boxing itself has distinct ethical implications related to subjects' rights to explanations.…”
Section: Ethical
confidence: 99%