"Lookahead" methods in deep learning can typically be characterized as follows: save the current state of the model, apply one or more gradient updates to a subset of the parameters, reload the saved state, and then leverage the information learned from the future state to modify the current parameters. This approach has been used extensively in meta-learning [16,42,7,17,29] and optimization [41,21,55,23,24], and recently in auxiliary task learning [34]. Unlike the above-mentioned methods, which look into the future to modify the optimization process, our work adapts this central concept to the multi-task learning domain to characterize task interactions and assign tasks to groups of networks.
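The generic save/update/reload pattern described above can be sketched as follows. This is a minimal illustration on a toy quadratic loss, not the method of any cited work; the function `lookahead_step`, the step count `k`, and the interpolation coefficient `alpha` are hypothetical choices introduced here for illustration:

```python
import copy

def lookahead_step(params, grad_fn, lr=0.1, k=5, alpha=0.5):
    """Hypothetical lookahead sketch: save the current parameters,
    take k gradient steps to reach a 'future' state, reload the saved
    state, and move it a fraction alpha toward the future state."""
    saved = copy.deepcopy(params)              # save the current state
    future = list(params)
    for _ in range(k):                         # apply k gradient updates
        grads = grad_fn(future)
        future = [p - lr * g for p, g in zip(future, grads)]
    # reload the saved state and blend in information from the future state
    return [s + alpha * (f - s) for s, f in zip(saved, future)]

# Toy quadratic loss L(p) = sum(p**2), so the gradient is 2*p per coordinate.
grad = lambda ps: [2.0 * p for p in ps]
new_params = lookahead_step([1.0, -2.0], grad, lr=0.1, k=5, alpha=0.5)
```

After the lookahead, each parameter has moved partway toward the future state rather than all the way, which is the sense in which the future state "modifies" rather than replaces the current one.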