2019
DOI: 10.1088/1742-5468/ab3281

Comparing dynamics: deep neural networks versus glassy systems

Abstract: We analyze numerically the training dynamics of deep neural networks (DNNs) using methods developed in the statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that during the training process the dynamics slows down because of an increasingly large number of flat directions. At lar…
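The abstract's central claim, that training slows down because of a growing number of flat directions, can be illustrated with a toy numerical experiment. The sketch below is entirely hypothetical and is not the paper's setup: the tiny network, the random data, and the flatness threshold are all assumptions. It estimates the fraction of near-flat directions at a point in parameter space by counting near-zero eigenvalues of a finite-difference Hessian.

```python
import numpy as np

# Hypothetical toy illustration (not the paper's experiments): count
# near-zero Hessian eigenvalues of a small network's loss as a proxy
# for the fraction of flat directions in the landscape.

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))        # toy inputs
y = rng.normal(size=20)             # toy targets
n_hidden = 3
n_params = 4 * n_hidden + n_hidden  # one-hidden-layer net, no biases

def loss(w):
    W1 = w[:4 * n_hidden].reshape(4, n_hidden)
    w2 = w[4 * n_hidden:]
    h = np.tanh(X @ W1)
    return np.mean((h @ w2 - y) ** 2)

def hessian(f, w, eps=1e-4):
    # second-order central finite differences, then symmetrize
    n = len(w)
    H = np.zeros((n, n))
    f0 = f(w)
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(w + ei + ej) - f(w + ei) - f(w + ej) + f0) / eps**2
    return 0.5 * (H + H.T)

w = rng.normal(size=n_params) * 0.5
H = hessian(loss, w)
eigs = np.linalg.eigvalsh(H)
flat_fraction = np.mean(np.abs(eigs) < 1e-2)  # threshold is an arbitrary choice
print(f"fraction of near-flat directions: {flat_fraction:.2f}")
```

The paper itself probes flatness dynamically (through the slowing down of training), whereas this sketch uses the simplest static proxy, the Hessian spectrum at a single point.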


Cited by 71 publications
(82 citation statements)
References 40 publications
“…The result suggests that the slow dynamics at the shorter time scales reflect the glassy aspect of the free-energy landscape, and the truncation of the slow dynamics at longer time scales reflects the presence of the liquid phase at the center. It is interesting to note that a similar truncation of the slow dynamics has been observed in a study of SGD dynamics in DNNs [18].…”
Section: Relaxational Dynamics of a Soft-core Model with Random Input (supporting)
confidence: 76%
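The two-time quantities behind phrases like "slow dynamics at the shorter time scales" are typically correlation functions C(t_w, t_w + t) measured for several waiting times t_w. The sketch below is purely illustrative: the overdamped Langevin toy dynamics is an assumption, standing in for neither the soft-core model nor the SGD dynamics of the quoted works. It only shows how such a two-time correlation is tabulated.

```python
import numpy as np

# Hypothetical sketch: two-time correlation C(t_w, t_w + t) of a toy
# parameter trajectory, the standard probe of glassy/aging dynamics.

rng = np.random.default_rng(1)
T, n = 2000, 50
w = np.zeros((T, n))
for step in range(1, T):
    # overdamped Langevin in a weak quadratic trap (toy stand-in for SGD noise)
    w[step] = w[step - 1] - 0.01 * w[step - 1] + 0.1 * rng.normal(size=n)

def two_time_corr(traj, tw, t):
    # average over components of the overlap between times tw and tw + t
    a, b = traj[tw], traj[tw + t]
    return float(a @ b / len(a))

for tw in (100, 400, 1600):
    row = [two_time_corr(w, tw, t) for t in (0, 100, 300)]
    print(tw, [f"{c:.3f}" for c in row])
```

In a genuinely glassy regime C(t_w, t_w + t) would depend on both arguments (aging); for this simple confined toy dynamics it eventually becomes time-translation invariant.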
“…Understanding the nature of such glass transitions and jamming is a fundamental problem in CSPs, since it is intimately related to the efficiency of algorithms for solving CSPs. In the context of DNNs, it is certainly important to understand the characteristics of the free-energy landscape in order to understand the efficiency of various learning algorithms for DNNs [18-20].…”
(mentioning)
confidence: 99%
“…This is of interest in a variety of fields [259], including population dynamics [260,261], models of evolutionary biology [54,262], spin glasses [215,263-271], neural networks [272], as well as in landscape-based string theory [273,274] and cosmology [275]. Recently, there has been a surge of interest in these questions in the context of deep learning and artificial intelligence [276-279]. Discussing these issues is beyond the scope of this review, and we refer the reader to the interesting recent reviews mentioned above.…”
Section: Discussion (mentioning)
confidence: 99%
“…In the left panel the weights are Gaussian (for both the teacher and the student), while in the center panel they are binary/Rademacher (recall that (H3) in Theorem 3.1 can be relaxed to include this case, see [11]). The full line is obtained from the fixed point of the state evolution (SE) of the AMP algorithm (15), corresponding to the extremizer of the replica free entropy (12). The points are the results of the AMP algorithm, run until convergence and averaged over 10 instances of size n = 10^4.…”
Section: Two Neurons (mentioning)
confidence: 99%
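The "fixed point of the state evolution" mentioned above is, in the simplest scalar case, just the limit of iterating a self-consistent map until it stops changing. The map below is a made-up toy (the actual SE equations of the AMP algorithm in the quoted work are problem-specific); it only illustrates the fixed-point iteration itself.

```python
# Hypothetical illustration of a state-evolution-style fixed-point
# iteration: repeatedly apply a toy self-consistent map m -> f(m)
# until convergence. The map and snr value are assumptions.

def se_map(m, snr=2.0):
    # toy scalar update; its nonzero fixed point is (snr - 1) / snr
    return snr * m / (1.0 + snr * m)

m = 0.1  # initialization away from the trivial fixed point m = 0
for _ in range(200):
    m_new = se_map(m)
    if abs(m_new - m) < 1e-12:
        break
    m = m_new
print(f"fixed point: {m:.6f}")
```

For snr = 2.0 the iteration converges to m = 0.5; in an AMP analysis this limiting value would be read off as the predicted overlap between the estimate and the ground truth.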