2021
DOI: 10.48550/arxiv.2101.07528
Preprint

The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods

Abstract: A recent line of work showed that various forms of convolutional kernel methods can be competitive with standard supervised deep convolutional networks on datasets like CIFAR-10, obtaining accuracies in the range of 87–90% while being more amenable to theoretical analysis. In this work, we highlight the importance of a data-dependent feature extraction step that is key to obtaining good performance in convolutional kernel methods. This step typically corresponds to a whitened dictionary of patches, and give…

Cited by 3 publications (2 citation statements)
References 27 publications (51 reference statements)
“…Similar convergences are observed for auditory processing, as activity of CNN trained on auditory tasks can predict neural activity in the corresponding cortices [59], or for networks that predict fMRI and MEG responses during language processing tasks [60]. Cutting-edge developments in artificial visual systems incorporate ideas from natural language processing such as transformers or local context embedding [61,62]. Researchers are still struggling to understand precisely how these mechanisms operate, or why they achieve such high performances.…”
Section: Hierarchical Processing Of Sensory Inputs
Citation type: mentioning
confidence: 71%
“…Many works have explored the effectiveness of patch-based image features. In the supervised setting, BagNet [Brendel and Bethge, 2018] and Thiry et al. [2021] showed that aggregation of patch-based features can achieve most of the performance of supervised learning on image datasets. In the unsupervised setting, Gidaris et al. [2020] perform SSL by requiring a bag-of-patches representation to be invariant between different views.…”
Section: Patch-based Representation
Citation type: mentioning
confidence: 99%