Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation
Preprint, 2022
DOI: 10.48550/arxiv.2205.09029

Cited by 2 publications (2 citation statements)
References 0 publications
“…In contrast, the learning still succeeds numerically, as any noise will perturb the dynamics off the saddle point, allowing learning to proceed (figure 6(A)). However, the dynamics still slow in the vicinity of the saddle point, providing a theoretical explanation for catastrophic slowing in deep linear networks (Lee et al 2022). We note that the analytical…”
Section: Continual
Citation type: mentioning
Confidence: 81%
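
A minimal sketch (illustrative, not code from the cited papers) of the point made in this statement: for a two-layer linear map w2 * w1 fitted to a target s, the origin is a saddle where every gradient vanishes, so exact gradient descent started there never moves, while any small amount of noise perturbs the weights off the saddle and learning then proceeds. The target s, noise scale, learning rate, and step count below are illustrative assumptions.

import numpy as np

def run(w1, w2, s=1.0, lr=0.05, steps=2000):
    """Plain gradient descent on L(w1, w2) = 0.5 * (s - w2 * w1) ** 2."""
    for _ in range(steps):
        err = s - w2 * w1
        # dL/dw1 = -w2 * err and dL/dw2 = -w1 * err, so both vanish at the origin
        w1, w2 = w1 + lr * w2 * err, w2 + lr * w1 * err
    return w2 * w1

print(run(0.0, 0.0))  # started exactly at the saddle: gradients are zero, output stays 0.0
rng = np.random.default_rng(0)
print(run(*rng.normal(scale=1e-6, size=2)))  # tiny noise: the dynamics escape and approach s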
“…When simulated numerically, the learning dynamics escape the saddle points due to imprecision of floating point arithmetic. However, numerical optimisation still suffers from catastrophic slowing (Lee et al 2022), as escaping the saddle point takes time (figure 6(A)). In contrast, in the case of aligned singular vectors (c = 0), we recover the equation for the temporal dynamics as described in Saxe et al (2014).…”
Section: J Stat Mech (2023) 114004
Citation type: mentioning
Confidence: 99%
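
A further sketch under the same illustrative assumptions: with aligned singular vectors, each mode strength a of a two-layer linear network follows a logistic equation of the form tau * da/dt = 2 * a * (s - a), along the lines of the temporal dynamics in Saxe et al (2014), with a saddle at a = 0. Integrating it numerically shows that the plateau before learning takes off, i.e. the slowing described in the statement above, grows as the initial strength a0 shrinks toward the saddle. The values of s, tau, dt, and a0 are illustrative, not taken from the cited papers.

def time_to_half(a0, s=1.0, tau=1.0, dt=1e-3, t_max=50.0):
    """Euler-integrate tau * da/dt = 2 * a * (s - a); return the time until a >= s / 2."""
    a, t = a0, 0.0
    while a < s / 2 and t < t_max:
        a += dt * 2.0 * a * (s - a) / tau  # one Euler step of the mode dynamics
        t += dt
    return t

for a0 in (1e-2, 1e-4, 1e-6, 1e-8):
    # The escape time grows roughly like (tau / (2 * s)) * log(s / a0),
    # diverging as a0 -> 0, i.e. as learning starts ever closer to the saddle.
    print(f"a0 = {a0:.0e}  ->  time to reach s/2: {time_to_half(a0):.2f}")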