Bayesian Model-Agnostic Meta-Learning
2018 · Preprint · DOI: 10.48550/arxiv.1806.03836

Abstract: Due to the inherent model uncertainty, learning to infer a Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines efficient gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. Unlike previous methods, during fast adaptation, the method is capable of learning complex uncertainty structure beyond a simple Gaussian…

Cited by 25 publications (32 citation statements) · References 16 publications
“…"Lookahead" methods in deep learning can often be characterized by saving the current state of the model, applying one or more gradient updates to a subset of the parameters, reloading the saved state, and then leveraging the information learned from the future state to modify the current set of parameters. This approach has been used extensively in the metalearning [16,42,7,17,29], optimization [41,21,55,23,24], and recently auxiliary task learning domains [34]. Unlike the above mentioned methods which look into the future to modify optimization processes, our work adapts this central concept to the multi-task learning domain to characterize task interactions and assign tasks to groups of networks.…”
Section: Architectures and Training Dynamicsmentioning
confidence: 99%
“…"Lookahead" methods in deep learning can often be characterized by saving the current state of the model, applying one or more gradient updates to a subset of the parameters, reloading the saved state, and then leveraging the information learned from the future state to modify the current set of parameters. This approach has been used extensively in the metalearning [16,42,7,17,29], optimization [41,21,55,23,24], and recently auxiliary task learning domains [34]. Unlike the above mentioned methods which look into the future to modify optimization processes, our work adapts this central concept to the multi-task learning domain to characterize task interactions and assign tasks to groups of networks.…”
Section: Architectures and Training Dynamicsmentioning
confidence: 99%
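The save/update/reload pattern this quote describes is easy to make concrete. Below is a minimal PyTorch-style sketch; the function name `lookahead_step`, its arguments, and the final interpolation rule (the slow-weight update of Zhang et al.'s Lookahead optimizer) are illustrative assumptions rather than the procedure of any single cited work, and meta-learning variants such as MAML instead reuse gradients computed at the future state.

```python
import copy
import torch

def lookahead_step(model, loss_fn, batch, k=5, inner_lr=1e-2, alpha=0.5):
    """One hypothetical lookahead step, assuming a float-only state_dict.

    Snapshot the weights, take k fast gradient steps on `batch`, then fold
    the "future" weights back into the snapshot via slow-weight interpolation.
    """
    snapshot = copy.deepcopy(model.state_dict())     # save the current state

    fast_opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
    for _ in range(k):                               # look k steps into the future
        fast_opt.zero_grad()
        loss_fn(model, batch).backward()
        fast_opt.step()

    future = model.state_dict()                      # the state k updates ahead
    merged = {name: snapshot[name] + alpha * (future[name] - snapshot[name])
              for name in snapshot}                  # reload, nudged toward the future
    model.load_state_dict(merged)
```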
“…Many variants have emerged to balance generalization and customization in a task-adaptive manner. From a generalization perspective, [13,27] suggested probabilistic extensions through a hierarchical Bayesian model and Stein variational gradient descent (SVGD) [37]. In addition, [49] conducted the inner loop in a low-dimensional latent embedding space, and [70] proposed a meta-regularization built on information theory.…”
Section: Related Work (mentioning, confidence: 99%)
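Since SVGD recurs across these citing papers (it is also the nonparametric variational inference engine in BMAML itself), a minimal NumPy sketch of one SVGD update may help. The RBF kernel with median-heuristic bandwidth is the standard choice from Liu & Wang (2016); the function and argument names are illustrative.

```python
import numpy as np

def svgd_step(particles, grad_log_p, step=1e-2):
    """One SVGD update (Liu & Wang, 2016) on an (n, d) particle array.

    `grad_log_p` maps (n, d) particles to their (n, d) scores.
    """
    n = particles.shape[0]
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d): x_i - x_j
    sq_dists = (diffs ** 2).sum(-1)                         # (n, n) squared distances
    h = np.median(sq_dists) / np.log(n + 1) + 1e-8          # median-heuristic bandwidth
    k = np.exp(-sq_dists / h)                               # RBF kernel matrix
    grad_k = (2.0 / h) * k[:, :, None] * diffs              # [i, j] = grad_{x_j} k(x_j, x_i)
    phi = (k @ grad_log_p(particles) + grad_k.sum(1)) / n   # driving + repulsive terms
    return particles + step * phi

# Toy usage: 50 particles drifting toward a standard 2-D Gaussian (score = -x).
x = np.random.default_rng(0).normal(size=(50, 2))
for _ in range(500):
    x = svgd_step(x, lambda p: -p)
```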
“…(17) Finally, the hyper-posterior $q^{\mathrm{WFEM\text{-}GP}}(\theta \mid \mathcal{D}_{1:N})$ is used in lieu of the corresponding PACOH-GP hyper-posterior in (14) in order to define the predictive distribution.…”
Section: Transfer Meta-Learning the GP Prior (mentioning, confidence: 99%)
“…In practice, for both PACOH-GP and WFEM-GP, the expectations in the predictive distributions (14) need to be approximated. As detailed in Supplementary Material A, this can be done by evaluating the maximum of the hyper-posteriors and plugging this value into the predictive distribution $p_{\theta}(t(x) = t \mid \mathcal{D})$, an approach we refer to as maximum a posteriori (MAP).…”
Section: Transfer Meta-Learning the GP Prior (mentioning, confidence: 99%)
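The plug-in scheme this quote describes can be written out compactly. A sketch, assuming the predictive distribution is an expectation under the hyper-posterior $q(\theta \mid \mathcal{D}_{1:N})$ (symbols follow the quotes above; equation (14) belongs to the cited paper):

```latex
% MAP plug-in approximation of the hyper-posterior predictive (assumed form).
p\bigl(t(x)=t \mid \mathcal{D}, \mathcal{D}_{1:N}\bigr)
  = \mathbb{E}_{\theta \sim q(\theta \mid \mathcal{D}_{1:N})}
    \bigl[\, p_{\theta}\bigl(t(x)=t \mid \mathcal{D}\bigr) \bigr]
  \;\approx\; p_{\hat{\theta}}\bigl(t(x)=t \mid \mathcal{D}\bigr),
\qquad
\hat{\theta} = \arg\max_{\theta}\, q\bigl(\theta \mid \mathcal{D}_{1:N}\bigr).
```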