2021
DOI: 10.48550/arxiv.2104.05025
Preprint
New Insights on Reducing Abrupt Representation Change in Online Continual Learning

Abstract: We study the online continual learning paradigm, where agents must learn from a changing distribution under constrained memory and compute. Previous work often tackles catastrophic forgetting by overcoming changes in the space of model parameters. In this work, we instead focus on the change in the representations of previously observed data caused by the introduction of previously unobserved class samples in the incoming data stream. We highlight the issues that arise in the practical setting where new classes must be …

Cited by 9 publications (26 citation statements)
References 21 publications
“…On the one hand, the negative bias towards past classes can be ascribed to the optimization of the cross-entropy loss on examples from the current task. As pointed out in [28], when a new task is presented to the network, an asymmetry arises between the contributions of replay data and current examples to the weight updates: indeed, the gradients of new (and poorly fit) examples outweigh those of the replayed ones (Fig. 2b).…”
Section: (L2) DER(++) Overemphasizes the Classes of the Current Task
confidence: 83%
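A minimal sketch (not taken from the cited works; the toy setup is illustrative) of why current-task examples dominate the updates: under softmax cross-entropy the per-logit gradient is softmax(z) − one_hot(y), so a poorly fit new-class example yields a far larger gradient than a well-fit replayed one.

```python
import torch
import torch.nn.functional as F

# Toy setup: 4 classes; classes {0, 1} are old (replayed), {2, 3} are new.
logits_replay = torch.tensor([[4.0, 0.0, 0.0, 0.0]], requires_grad=True)  # well fit
logits_new = torch.tensor([[0.0, 0.0, 0.5, 0.0]], requires_grad=True)     # poorly fit

F.cross_entropy(logits_replay, torch.tensor([0])).backward()
F.cross_entropy(logits_new, torch.tensor([2])).backward()

# Gradient w.r.t. the logits is softmax(z) - one_hot(y):
print(logits_replay.grad.norm())  # small, roughly 0.06
print(logits_new.grad.norm())     # large, roughly 0.75
```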
“…As also observed in other recent works [28], [29], [42], this issue can be mitigated by revising the way the cross-entropy loss is applied during training. Given an example from the current task, we avoid computing the softmax activation on all logits and instead restrict it to those modeling the scores of the current task classes.…”
Section: Preventing Penalization of Past Classes
confidence: 87%
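A minimal sketch of the masking described above, assuming a PyTorch classifier; the function name and toy class split are illustrative, not from the cited papers. Logits of classes outside the current task are set to −inf before the softmax, so current-task examples produce no gradient that pushes down the scores of past classes.

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, current_classes):
    """Cross-entropy restricted to the logits of the current task's classes."""
    mask = torch.full_like(logits, float("-inf"))
    mask[:, current_classes] = 0.0  # keep current-task logits, mask the rest
    return F.cross_entropy(logits + mask, targets)

# Example: 10 classes in total, classes 8 and 9 belong to the current task.
logits = torch.randn(4, 10, requires_grad=True)
targets = torch.tensor([8, 9, 8, 9])
masked_cross_entropy(logits, targets, current_classes=[8, 9]).backward()

# Masked classes get zero softmax probability, hence zero gradient:
print(logits.grad[:, :8].abs().sum())  # tensor(0.)
```

Replayed examples from past tasks would still be trained with the ordinary, unmasked cross-entropy, so the asymmetry only removes the penalization of past classes by current-task data.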