How to train your MAML
Antoniou et al., 2018 (preprint)
DOI: 10.48550/arxiv.1810.09502

Abstract: The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model-Agnostic Meta-Learning (MAML) is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues, such as being very sensitive to neural network architectures, often leading to instability during training, and requiring arduous hyperparameter searches t…

Cited by 95 publications (137 citation statements)
References 15 publications
“…Other approaches have also used learnable learning rates for multiple steps, e.g. MAML++ (Antoniou et al., 2018) uses per-layer (as opposed to per-parameter) and per-step learning rates. The learning rates are initialised to U[0.005, 0.1] and clipped at (0, 1)…”
Section: A2 Siren and Modulations (mentioning)
Confidence: 99%
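To make the per-layer, per-step learning-rate scheme concrete, here is a minimal PyTorch-style sketch (not the authors' released code; helper names such as init_inner_lrs and inner_update are illustrative) that initialises one learnable rate per parameter tensor and per inner step from U[0.005, 0.1] and clamps it to stay inside (0, 1):

```python
import torch

def init_inner_lrs(model, num_inner_steps):
    """One learnable learning rate per parameter tensor ("layer") and per inner step."""
    return torch.nn.ParameterDict({
        name.replace('.', '_'): torch.nn.Parameter(
            torch.empty(num_inner_steps).uniform_(0.005, 0.1))
        for name, _ in model.named_parameters()
    })

def inner_update(named_params, grads, inner_lrs, step):
    """Apply one inner-loop step using the layer- and step-specific learning rates."""
    updated = {}
    for name, param in named_params.items():
        # Clamp to keep each rate strictly inside (0, 1), per the quoted description.
        lr = inner_lrs[name.replace('.', '_')][step].clamp(1e-6, 1.0 - 1e-6)
        updated[name] = param - lr * grads[name]
    return updated
```

In MAML++ the rates themselves are meta-learned, so in this sketch they would simply be handed to the outer-loop optimiser together with the shared initialisation.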
“…However, training of MAML can become unstable when there is even a tiny change in the neural network structure. With this observation, the authors in [6] proposed the MAML++ algorithm, which contains schemes for stabilizing the training. Another challenge is that training MAML involves second derivatives when conducting backpropagation, which increases the computational cost…”
Section: B Related Work (mentioning)
Confidence: 99%
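The second-derivative cost mentioned here comes from differentiating through the inner-loop gradient step. A minimal PyTorch sketch (illustrative only, not code from any of the cited papers) shows where this happens: create_graph=True keeps the inner gradient in the autograd graph so the outer backward pass involves second derivatives, while create_graph=False gives the cheaper first-order approximation.

```python
import torch

def inner_step(loss, params, lr=0.01, second_order=True):
    """One inner-loop gradient step on the support-set loss."""
    # create_graph=True retains the graph of this gradient computation, so the
    # outer (query) loss can be backpropagated through the update (second-order MAML).
    grads = torch.autograd.grad(loss, params, create_graph=second_order)
    return [p - lr * g for p, g in zip(params, grads)]

# Toy usage with a single linear parameter adapted on a support batch.
w = torch.randn(5, 1, requires_grad=True)
x, y = torch.randn(8, 5), torch.randn(8, 1)
support_loss = ((x @ w - y) ** 2).mean()
adapted_w, = inner_step(support_loss, [w], second_order=True)
query_loss = ((x @ adapted_w - y) ** 2).mean()
query_loss.backward()  # differentiates through the inner step, hence second derivatives w.r.t. w
```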
“…2) Memory networks (Munkhdalai & Yu, 2017; Santoro et al., 2016; Oreshkin et al., 2018; Mishra et al., 2017), which focus on learning to store "experience" from previously observed tasks in the interest of generalizing to newer tasks. 3) Gradient-based meta-learning methods (Finn et al., 2017; Antoniou et al., 2018; Ravi & Larochelle, 2017; Grant et al., 2018; Zhang et al., 2018; Sun et al., 2019), which aim to meta-learn a model in the outer loop that is used as a starting point in the inner loop for a new few-shot task. The PLATINUM framework embeds semi-supervision for gradient-descent-based methods that use an outer-inner bi-level optimization…”
Section: Related Work (mentioning)
Confidence: 99%
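For reference, the outer-inner bi-level structure that these gradient-based methods share can be written in MAML's standard form (Finn et al., 2017), shown here with a single inner gradient step of size α on each task's support loss:

```latex
\theta^{*} \;=\; \arg\min_{\theta} \sum_{\mathcal{T}_i \sim p(\mathcal{T})}
  \mathcal{L}_{\mathcal{T}_i}\!\Bigl(\,
    \underbrace{\theta - \alpha \,\nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(\theta)}_{\text{inner-loop adaptation}}
  \Bigr)
```

The outer loop optimises the shared initialisation θ, while the inner adaptation step is what MAML++ augments with the per-layer, per-step learning rates quoted in the first citation statement above.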