2022
DOI: 10.48550/arxiv.2202.06438
Preprint

Learning from Randomly Initialized Neural Network Features

Abstract: We present the surprising result that randomly initialized neural networks are good feature extractors in expectation. These random features correspond to finite-sample realizations of what we call Neural Network Prior Kernel (NNPK), which is inherently infinite-dimensional. We conduct ablations across multiple architectures of varying sizes as well as initializations and activation functions. Our analysis suggests that certain structures that manifest in a trained model are already present at initialization. …
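The headline claim lends itself to a toy illustration: draw a network, never train it, and use its last hidden layer as a feature map for a linear probe. The sketch below is a minimal NumPy illustration of that idea, not the paper's experimental setup; the MLP architecture, the layer widths, and the He-style initialization are assumptions.

```python
import numpy as np

def random_mlp_features(X, widths=(256, 128), seed=0):
    """Push inputs through a randomly initialized ReLU MLP and return
    the last hidden layer's activations as features. The network is
    never trained."""
    rs = np.random.default_rng(seed)
    h = X
    for w in widths:
        # He-style initialization scale for ReLU layers (an assumption).
        W = rs.normal(0.0, np.sqrt(2.0 / h.shape[1]), size=(h.shape[1], w))
        h = np.maximum(h @ W, 0.0)
    return h

# Toy data: two Gaussian blobs in 20 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (200, 20)),
               rng.normal(1.0, 1.0, (200, 20))])
y = np.array([0] * 200 + [1] * 200)

Phi = random_mlp_features(X)
# Linear probe: least-squares fit on +/-1 targets over the random features.
w, *_ = np.linalg.lstsq(Phi, 2.0 * y - 1.0, rcond=None)
acc = np.mean((Phi @ w > 0) == y)
print(f"linear-probe accuracy on random features: {acc:.2f}")
```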

Cited by 4 publications (4 citation statements)
References 33 publications
“…Here we list two reasons based on the observations from previous work. First, randomly initialized networks are reported to produce powerful representations for multiple computer vision tasks (Saxe et al., 2011; Cao et al., 2018; Amid et al., 2022). Second, such random networks are shown to perform a distance-preserving embedding of the data, i.e.…”
Section: Discussion
confidence: 99%
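The distance-preservation point in this excerpt can be checked numerically. The snippet below uses a single random linear map, the simplest case where pairwise distances are preserved in expectation (Johnson-Lindenstrauss style); the output width and the 1/sqrt(d) scaling are assumptions, and the linear map is a stand-in for the full random networks the quote refers to.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 50))

# One random linear map, scaled so squared distances are preserved
# in expectation; the output width is an assumption.
d_out = 512
W = rng.normal(0.0, 1.0 / np.sqrt(d_out), size=(50, d_out))
Z = X @ W

def pdist(A):
    # Pairwise Euclidean distances via the Gram-matrix identity.
    G = A @ A.T
    sq = np.diag(G)[:, None] + np.diag(G)[None, :] - 2.0 * G
    return np.sqrt(np.maximum(sq, 0.0))

i, j = np.triu_indices(100, 1)
ratio = pdist(Z)[i, j] / pdist(X)[i, j]
print(f"embedded/input distance ratio: "
      f"mean={ratio.mean():.3f}, std={ratio.std():.3f}")
```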
“…The NPCK bears a close similarity to the neural network prior kernel (NNPK), which is defined by Amid et al. (2022) as…”
Section: Neural Posterior Correlation Kernel
confidence: 99%
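The quoted definition is cut off. For orientation only, here is a hedged reconstruction consistent with the abstract above (the NNPK as an expectation over random initializations of feature inner products); the feature map and the initialization distribution are written in our own notation, not necessarily the citing paper's.

```latex
% Hedged reconstruction (notation ours, not the citing paper's):
% \phi_\theta(x) is the network's feature map at initialization \theta,
% and p(\theta) is the initialization distribution.
\[
  \kappa_{\mathrm{NNPK}}(x, x') \;=\;
  \mathbb{E}_{\theta \sim p(\theta)}
  \bigl[ \langle \phi_\theta(x), \phi_\theta(x') \rangle \bigr]
\]
```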
“…ii) Randomly initialized models are not valid embedding functions for the Maximum Mean Discrepancy (MMD) [18] estimation used in DM. Specifically, DM justifies the validity of randomly initialized models by their intrinsic classification power observed in tasks such as deep clustering [3,5,6,36]. However, we believe that this does not apply to DM, as randomly initialized models do not satisfy the requirements of embedding functions used in MMD, making it an invalid measure of distribution distance.…”
Section: Optimization-oriented Methods
confidence: 98%
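For context on this objection: MMD with a fixed embedding reduces to comparing statistics of the embedded samples, and in the common linear-kernel form it is the squared distance between mean embeddings. The sketch below illustrates that form with a randomly initialized embedding; it is an illustrative stand-in under assumed sizes and architecture, not DM's actual estimator.

```python
import numpy as np

def random_relu_embed(X, d_out=256, seed=0):
    """A randomly initialized one-layer ReLU embedding: an illustrative
    stand-in for the random models the quote discusses, not DM's setup."""
    rs = np.random.default_rng(seed)
    W = rs.normal(0.0, np.sqrt(2.0 / X.shape[1]), size=(X.shape[1], d_out))
    return np.maximum(X @ W, 0.0)

def mmd2_mean_embedding(X, Y, embed):
    """MMD^2 under the linear kernel in the embedding space, i.e. the
    squared distance between the two samples' mean embeddings."""
    return float(((embed(X).mean(0) - embed(Y).mean(0)) ** 2).sum())

rng = np.random.default_rng(2)
A = rng.normal(0.0, 1.0, (256, 32))
B = rng.normal(0.4, 1.0, (256, 32))  # shifted distribution
A2 = rng.normal(0.0, 1.0, (256, 32))  # fresh sample, same distribution as A
print(f"MMD^2(A, B)  = {mmd2_mean_embedding(A, B, random_relu_embed):.4f}")
print(f"MMD^2(A, A') = {mmd2_mean_embedding(A, A2, random_relu_embed):.4f}")
```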