2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.753

Expert Gate: Lifelong Learning with a Network of Experts

Abstract: In this paper we introduce a model of lifelong learning, based on a Network of Experts. New tasks / experts are learned and added to the model sequentially, building on what was learned before. To ensure scalability of this process, data from previous tasks cannot be stored and hence is not available when learning a new task. A critical issue in such context, not addressed in the literature so far, relates to the decision which expert to deploy at test time. We introduce a set of gating autoencoders that learn…

Citations: cited by 415 publications (310 citation statements)
References: 32 publications
“…Our proposed method is a data-based approach, but it is different from prior works [3,12,24,33], because their model commonly learns with the task-wise local distillation loss in Eq. (2). We emphasize that local distillation only preserves the knowledge within each of the previous tasks, while global distillation does the knowledge over all tasks.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…For example, if we have learned "lion" and "tiger" earlier, that can help us later in time to learn a "liger" (a rare hybrid cross between a male lion and a female tiger), even with just a few examples. Relating this to point (2) above, this further allows compositional lifelong learning to help recognize new facts (e.g. dog, riding, wave) based on facts seen earlier in time (e.g.…”
Section: Concepts Of Varying Complexity
Citation type: mentioning
confidence: 95%
“…We argue that the evaluation of LLL methods should be reconsidered. In the standard LLL (with a few notable exceptions, such as [2,4]), the trained models are judged by their capability to recognize each task's categories individually assuming the absence of the categories covered by the remaining tasks - not necessarily realistic. Although the performance of each task in isolation is an important characteristic, it might be deceiving.…”
Section: Concepts Of Varying Complexity
Citation type: mentioning
confidence: 99%
“…Some of the aforementioned methods make use of conditional computation, i.e. the gating network selects a subset of experts to evaluate while others stay idle [51,19,3]. While this is computationally efficient, routing errors can occur, i.e.…”
Section: Related Work
Citation type: mentioning
confidence: 99%