2020
DOI: 10.48550/arxiv.2007.08558
Preprint

On Robustness and Transferability of Convolutional Neural Networks

Abstract: Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts. However, several recent breakthroughs in transfer learning suggest that these networks can cope with severe distribution shifts and successfully adapt to new tasks from a few training examples. In this work we revisit the out-of-distribution and transfer performance of modern image classification CNNs and investigate the impact of the pre-training data size, the model scale, and the data preprocessing…

Cited by 12 publications (17 citation statements)
References 46 publications (86 reference statements)
“…As for CNNs, Djolonga et al. (2020) evaluate the impact of model size and dataset size on robustness, where the classes at train and test time are the same but there is a distribution shift in the data, for example changes in the lighting of image samples. They find that scaling both model size and training-set size improves such robustness.…”
Section: Related Work
confidence: 99%
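To make the kind of evaluation described above concrete, here is a minimal sketch that measures a fixed classifier's accuracy on clean versus lighting-shifted inputs (same label set, shifted input distribution). The ResNet-50 checkpoint, the brightness factor, and the `imagenet/val` path are illustrative assumptions, not the setup of Djolonga et al. (2020), who sweep far larger pre-training sets and model scales.

```python
# Sketch: compare accuracy of a fixed classifier on clean vs. darkened inputs.
# Model choice, brightness factor, and dataset path are illustrative only.
import torch
import torchvision.transforms.functional as TF
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet50(weights="IMAGENET1K_V1").to(device).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

def accuracy(loader, shift=None):
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            if shift is not None:
                x = shift(x)  # apply the distribution shift to raw pixels
            x = TF.normalize(x, mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
            pred = model(x.to(device)).argmax(dim=1).cpu()
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total

# A crude stand-in for a lighting shift: darken every image by a fixed factor.
darken = lambda x: TF.adjust_brightness(x, brightness_factor=0.4)

val = datasets.ImageFolder("imagenet/val", transform=preprocess)  # hypothetical path
loader = torch.utils.data.DataLoader(val, batch_size=64, num_workers=4)
print("clean:", accuracy(loader), "shifted:", accuracy(loader, shift=darken))
```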
“…where $L^q_+$ denotes the subspace of almost-everywhere non-negative functions of $L^q$, for $(1/p) + (1/q) = 1$. Here the Lagrangian $\mathcal{L}_{\mathrm{PI}}(\theta, t, \lambda)$ is defined as $$\mathcal{L}_{\mathrm{PI}}(\theta, t, \lambda) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[t(x, y)\bigr] + \int \lambda(x, \delta, y)\Bigl[\ell\bigl(f_\theta(x + \delta), y\bigr) - t(x, y)\Bigr]\,dx\,d\delta\,dy = \int t(x, y)\Bigl[p(x, y) - \int \lambda(x, \delta, y)\,d\delta\Bigr]\,dx\,dy + \int \lambda(x, \delta, y)\,\ell\bigl(f_\theta(x + \delta), y\bigr)\,dx\,d\delta\,dy, \tag{14}$$ where we used the density $p$ of the data distribution $\mathcal{D}$. Then, notice that (PV) can be written iteratively as $P_R = \min_{\theta \in \Theta} p(\theta)$ where $p(\theta) = \min_{t \in L^p} \max$…”
Section: B Proof of Proposition 3.1
confidence: 99%
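The step the quoted proof takes from (14) to the iterated min-max can be filled in with one line: the Lagrangian is linear in $t$, so minimizing over $t \in L^p$ either enforces a density-matching constraint on $\lambda$ or diverges. The display below is a sketch of that standard duality argument in our notation (the loss symbol $\ell$ included), not a quotation from the citing paper.

```latex
% Standard duality step, sketched: the Lagrangian (14) is linear in t, so the
% inner minimum over t in L^p is finite only when the coefficient of t
% vanishes almost everywhere.
\min_{t \in L^p} \mathcal{L}_{\mathrm{PI}}(\theta, t, \lambda) =
\begin{cases}
  \displaystyle \int \lambda(x,\delta,y)\,
    \ell\bigl(f_\theta(x+\delta), y\bigr)\,dx\,d\delta\,dy
    & \text{if } \displaystyle\int \lambda(x,\delta,y)\,d\delta = p(x,y)
      \ \text{a.e.}, \\[6pt]
  -\infty & \text{otherwise.}
\end{cases}
```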
“…Adversarial robustness. As described in Section 1, it is well-known that state-of-the-art classifiers are susceptible to adversarial attacks [11][12][13][14][15][16][17][26]. Toward addressing this challenge, a rapidly-growing body of work has provided attack algorithms that generate data perturbations which fool classifiers, and defense algorithms designed to train classifiers to be robust against these perturbations.…”
Section: Further Related Work
confidence: 99%
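As one concrete instance of the attack algorithms the passage alludes to (not a method from the cited paper), the fast gradient sign method of Goodfellow et al. (2015) perturbs an input one step along the sign of the input-gradient of the loss. A minimal PyTorch sketch:

```python
# Minimal sketch of one classic attack, the fast gradient sign method (FGSM);
# illustrative of the "attack algorithms" mentioned above. `model` is any
# differentiable classifier mapping images in [0, 1] to logits.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """Perturb x by eps along the sign of the input-gradient of the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()        # one signed gradient step
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid range
```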
“…These evaluations tend instead to focus on shifts between photos and stylized versions like sketches (Li et al., 2017; Venkateswara et al., 2017; Peng et al., 2019) or synthetic renderings (Peng et al., 2018), or between variants of digits datasets like MNIST (LeCun et al., 1998) and SVHN (Netzer et al., 2011). Unfortunately, prior work has shown that methods that work well on one type of shift need not generalize to others (Taori et al., 2020; Djolonga et al., 2020; Xie et al., 2021a; Miller et al., 2021), which raises the question of how well they would work on a wider array of realistic shifts.…”
Section: Introduction
confidence: 99%