2022
DOI: 10.48550/arxiv.2202.08087
Preprint

Extended Unconstrained Features Model for Exploring Deep Neural Collapse

Abstract: The modern strategy for training deep neural networks for classification tasks includes optimizing the network's weights even after the training error vanishes to further push the training loss toward zero. Recently, a phenomenon termed "neural collapse" (NC) has been empirically observed in this training procedure. Specifically, it has been shown that the learned features (the output of the penultimate layer) of within-class samples converge to their mean, and the means of different classes exhibit a certain …
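The abstract is truncated above; as context, here is a minimal sketch of the neural collapse properties as they are commonly stated in this literature (the notation is assumed here, not quoted from the preprint).

```latex
% Commonly stated neural collapse (NC) properties (notation assumed, not
% taken from the preprint). h_{k,i}: penultimate-layer feature of sample i
% in class k; \bar h_k: class mean; \bar h_G: global mean; w_k: k-th
% classifier row; K: number of classes.
\begin{align*}
&\textbf{(NC1)}\ \text{variability collapse:} && h_{k,i} \to \bar h_k \ \ \text{for all } i,\\
&\textbf{(NC2)}\ \text{simplex ETF of class means:} &&
  \frac{\langle \bar h_k - \bar h_G,\ \bar h_{k'} - \bar h_G\rangle}
       {\lVert \bar h_k - \bar h_G\rVert\,\lVert \bar h_{k'} - \bar h_G\rVert}
  \to -\tfrac{1}{K-1}\ \ (k \neq k'),\\
&\textbf{(NC3)}\ \text{self-duality:} &&
  \frac{w_k}{\lVert w_k\rVert} \to \frac{\bar h_k - \bar h_G}{\lVert \bar h_k - \bar h_G\rVert},\\
&\textbf{(NC4)}\ \text{nearest-mean decisions:} &&
  \arg\max_{k'}\,\langle w_{k'}, h\rangle \to \arg\min_{k'}\,\lVert h - \bar h_{k'}\rVert.
\end{align*}
```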

Cited by 4 publications (18 citation statements)
References 13 publications (21 reference statements)
“…The underlying reasoning is that modern deep networks are often highly overparameterized, with the capacity to learn any representation [19][20][21][22], so that the last-layer features can approximate, or interpolate, any point in the feature space. Under the unconstrained feature model, the works [14][15][16][23][24][25][26] showed that the NC solutions are the only globally optimal solutions for nonconvex training losses under different settings. However, given the nonconvexity of the problem, even under the unconstrained feature model these global optimality results do not guarantee that the NC solutions can be efficiently achieved.…”
Section: Introduction (mentioning)
confidence: 99%
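For reference, a minimal sketch of the regularized unconstrained feature model referred to in this statement; the exact loss, weights, and whether a bias term is included vary across the cited works.

```latex
% Unconstrained feature model (UFM): the last-layer features H are free
% optimization variables rather than outputs of a network backbone.
% W \in \mathbb{R}^{K \times d}: linear classifier; H = [h_{k,i}] \in \mathbb{R}^{d \times N};
% \mathcal{L}: CE or MSE loss; y_k: label of class k; \lambda_W, \lambda_H > 0.
\min_{W,\,H}\ \frac{1}{N}\sum_{k=1}^{K}\sum_{i=1}^{n}
  \mathcal{L}\bigl(W h_{k,i},\, y_k\bigr)
  \;+\; \frac{\lambda_W}{2}\lVert W\rVert_F^2
  \;+\; \frac{\lambda_H}{2}\lVert H\rVert_F^2
```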
“…Meanwhile, the MSE loss is not only appealing for its algebraic simplicity, but also demonstrates on-par or even better generalization performance compared to the CE loss, as reported by a recent line of work [10]. However, theoretical study of the MSE loss for NC is still limited [12,15,26]. Under the unconstrained feature model, these works proved that the continuous gradient flow of the MSE loss converges to NC solutions.…”
Section: Introduction (mentioning)
confidence: 99%
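A minimal sketch of the MSE instance of the unconstrained feature model with one-hot targets; the bias term and the exact regularization differ between the cited analyses, and the gradient-flow results are typically stated for lightly regularized or unregularized versions.

```latex
% MSE loss under the unconstrained feature model; e_k is the one-hot target
% for class k, b an optional bias, N = Kn the total number of samples.
\min_{W,\,H,\,b}\ \frac{1}{2N}\sum_{k=1}^{K}\sum_{i=1}^{n}
  \bigl\lVert W h_{k,i} + b - e_k \bigr\rVert_2^2
  \;+\; \frac{\lambda_W}{2}\lVert W\rVert_F^2
  \;+\; \frac{\lambda_H}{2}\lVert H\rVert_F^2
```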
“…They prove that neural collapse emerges under the CE loss with proper constraints or regularization. Other studies prove that the MSE loss also leads to the neural collapse solution [6,12,13,14]. The work [31] proposes a convex formulation for a norm-regularized ReLU network and explains neural collapse accordingly.…”
Section: Neural Collapse (mentioning)
confidence: 99%
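For comparison, a sketch of the norm-constrained CE formulation analyzed in several of the works cited here; some versions constrain per-class or per-sample norms, while others use regularization instead of hard constraints.

```latex
% Cross-entropy under a layer-peeled / unconstrained feature model with
% norm constraints E_W, E_H on the classifier rows and the features.
\min_{W,\,H}\ \frac{1}{N}\sum_{k=1}^{K}\sum_{i=1}^{n}
  -\log\frac{\exp\!\bigl(\langle w_k,\, h_{k,i}\rangle\bigr)}
            {\sum_{k'=1}^{K}\exp\!\bigl(\langle w_{k'},\, h_{k,i}\rangle\bigr)}
\quad \text{s.t.}\quad \lVert w_k\rVert_2 \le E_W,\ \ \lVert h_{k,i}\rVert_2 \le E_H
```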
“…It has been shown that optimality of the LPM under the CE loss leads to the neural collapse solution with feature norm constraints [8,5,7,9], regularization [10], or even no explicit constraint [11]. Other studies instead analyze the unconstrained LPM under the mean squared error (MSE) loss and also derive the neural collapse solution [6,12,13,14].…”
Section: Introduction (mentioning)
confidence: 99%
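To make the "neural collapse solution" mentioned in these statements concrete, below is a small, self-contained sketch (construction and names assumed, not taken from the paper) that builds a simplex equiangular tight frame and numerically checks its Gram structure.

```python
import numpy as np

# A minimal, hypothetical sketch (not from the paper): build a K-class
# simplex equiangular tight frame (ETF), the geometry the cited works
# identify as the neural-collapse configuration of the class means.
K, d = 4, 16           # number of classes and feature dimension (assumes d >= K)
rng = np.random.default_rng(0)

# Orthonormal basis U (d x K), then center and rescale to obtain the simplex ETF.
U, _ = np.linalg.qr(rng.standard_normal((d, K)))
M = np.sqrt(K / (K - 1)) * U @ (np.eye(K) - np.ones((K, K)) / K)

# The ETF Gram matrix should equal K/(K-1) * (I - 11^T/K): unit-norm columns
# with pairwise inner products of -1/(K-1).
G = M.T @ M
assert np.allclose(np.diag(G), 1.0, atol=1e-8)
assert np.allclose(G[~np.eye(K, dtype=bool)], -1.0 / (K - 1), atol=1e-8)
print(np.round(G, 3))
```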