2016
DOI: 10.1038/srep32672

Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition

Abstract: Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have been shown to be able to recognize thousands of object categories in natural image databases. Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields and a hierarchy of layers which progressively extract more and more abstract features. Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar …
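
As a rough illustration of the architecture the abstract describes, the sketch below builds a small feed-forward convolutional hierarchy in Python with PyTorch. It is a hypothetical minimal example, not one of the networks evaluated in the paper: the class name, layer sizes, and input resolution are arbitrary. Small 3x3 kernels give each unit a restricted receptive field, and the stacked convolution/pooling stages extract progressively more abstract, position-tolerant features before a linear readout.

# Minimal sketch (assumed PyTorch; not the authors' model): a small feed-forward
# convolutional hierarchy with restricted receptive fields and progressively
# more abstract features (spatial resolution shrinks while channel depth grows).
import torch
import torch.nn as nn

class TinyFeedforwardCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            # Stage 1: local 3x3 receptive fields over the RGB input
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            # Stage 2: units now integrate over a larger image region
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
            # Stage 3: more abstract, position-tolerant features
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # 8x8 -> 1x1 (global pooling)
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)                      # purely feed-forward pass
        return self.classifier(h.flatten(1))      # class scores

if __name__ == "__main__":
    model = TinyFeedforwardCNN()
    logits = model(torch.randn(1, 3, 32, 32))     # one 32x32 RGB image
    print(logits.shape)                           # torch.Size([1, 10])

Stacking more such stages, as the much deeper networks compared in the paper do, widens the effective receptive field and lengthens the feature hierarchy.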


Cited by 164 publications (123 citation statements)
References 89 publications
“…Interestingly, further developments of these early computational models have led to modern deep convolutional neural networks (DCNNs), which have powered recent breakthroughs in computer vision [23] as well as many other domains. Although these network models are not constrained by experimental data, they have nonetheless been shown to provide an even better fit than earlier models to both behavioral [18,24,25] and electrophysiological [26,27] data (but see Ref. 28).…”
Section: Introduction
confidence: 99%
“…[59,60]) including the CNN setting (e.g. [61,62]), lack the capability of generalisation to novel objects, which is a crucial requirement in prosthetic hand applications. To address this issue, we either need a very large amount of training data or we can capitalise on the flexibility of deep learning system to generalise based on learning abstract representation of different classes of training data.…”
Section: Object Classification Versus Grasp Identification
confidence: 99%
“…The proposed DL‐MO framework was based on a fundamental hypothesis that there exists similarity between the state‐of‐the‐art CNN architectures and human visual system in object detection. This hypothesis was at least partially supported by several prior studies, which systematically investigated the correlation between CNN architectures and human neural response using carefully designed psychophysical experiments. Nonetheless, the corresponding degree of similarity varied a lot across different CNN architectures, which could be attributed to the different feature representation power of those CNNs.…”
Section: Discussion
confidence: 69%
“…experiments. [41][42][43] Nonetheless, the corresponding degree of similarity varied a lot across different CNN architectures, which could be attributed to the different feature representation power of those CNNs. If using a different pretrained CNN architecture in the DL-MO framework, the strength of the correlation and agreement with HOs may be altered.…”
Section: Discussion
confidence: 99%