How to train your MAML
Antoniou et al., 2018 (preprint)
DOI: 10.48550/arxiv.1810.09502

Abstract: The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model-Agnostic Meta-Learning (MAML) is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues, such as being very sensitive to neural network architectures, often leading to instability during training, and requiring arduous hyperparameter searches t…

Cited by 95 publications (137 citation statements)
References 15 publications
“…Other approaches have also used learnable learning rates for multiple steps, e.g. MAML++ (Antoniou et al., 2018) uses per-layer (as opposed to per-parameter) and per-step learning rates. The learning rates are initialised to U[0.005, 0.1] and clipped at (0, 1)…”
Section: A2 Siren and Modulations (mentioning)
Confidence: 99%
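To make the per-layer, per-step learning-rate scheme concrete, here is a minimal PyTorch-style sketch (not the authors' released code; helper names such as init_inner_lrs and inner_update are illustrative) that initialises one learnable rate per parameter tensor and per inner step from U[0.005, 0.1] and clamps it to stay inside (0, 1):

```python
import torch

def init_inner_lrs(model, num_inner_steps):
    """One learnable learning rate per parameter tensor ("layer") and per inner step."""
    return torch.nn.ParameterDict({
        name.replace('.', '_'): torch.nn.Parameter(
            torch.empty(num_inner_steps).uniform_(0.005, 0.1))
        for name, _ in model.named_parameters()
    })

def inner_update(named_params, grads, inner_lrs, step):
    """Apply one inner-loop step using the layer- and step-specific learning rates."""
    updated = {}
    for name, param in named_params.items():
        # Clamp to keep each rate strictly inside (0, 1), per the quoted description.
        lr = inner_lrs[name.replace('.', '_')][step].clamp(1e-6, 1.0 - 1e-6)
        updated[name] = param - lr * grads[name]
    return updated
```

In MAML++ the rates themselves are meta-learned, so in this sketch they would simply be handed to the outer-loop optimiser together with the shared initialisation.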
“…However, training of MAML can become unstable when there is even a tiny change in the neural network structure. With this observation, the authors in [6] proposed the MAML++ algorithm, which contains schemes for stabilizing the training. Another challenge is that training MAML involves second derivatives when conducting backpropagation, which increases the computational cost…”
Section: B Related Work (mentioning)
Confidence: 99%
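The second-derivative cost mentioned here comes from differentiating through the inner-loop gradient step. A minimal PyTorch sketch (illustrative only, not code from any of the cited papers) shows where this happens: create_graph=True keeps the inner gradient in the autograd graph so the outer backward pass involves second derivatives, while create_graph=False gives the cheaper first-order approximation.

```python
import torch

def inner_step(loss, params, lr=0.01, second_order=True):
    """One inner-loop gradient step on the support-set loss."""
    # create_graph=True retains the graph of this gradient computation, so the
    # outer (query) loss can be backpropagated through the update (second-order MAML).
    grads = torch.autograd.grad(loss, params, create_graph=second_order)
    return [p - lr * g for p, g in zip(params, grads)]

# Toy usage with a single linear parameter adapted on a support batch.
w = torch.randn(5, 1, requires_grad=True)
x, y = torch.randn(8, 5), torch.randn(8, 1)
support_loss = ((x @ w - y) ** 2).mean()
adapted_w, = inner_step(support_loss, [w], second_order=True)
query_loss = ((x @ adapted_w - y) ** 2).mean()
query_loss.backward()  # differentiates through the inner step, hence second derivatives w.r.t. w
```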
“…2) Memory networks (Munkhdalai & Yu, 2017; Santoro et al., 2016; Oreshkin et al., 2018; Mishra et al., 2017), which focus on learning to store "experience" from previously observed tasks in the interest of generalizing to newer tasks. 3) Gradient-based meta-learning methods (Finn et al., 2017; Antoniou et al., 2018; Ravi & Larochelle, 2017; Grant et al., 2018; Zhang et al., 2018; Sun et al., 2019), which aim to meta-learn a model in the outer loop that is used as a starting point in the inner loop for a new few-shot task. The PLATINUM framework embeds semi-supervision for gradient-descent-based methods that use an outer-inner bi-level optimization…”
Section: Related Work (mentioning)
Confidence: 99%
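For reference, the outer-inner bi-level structure that these gradient-based methods share can be written in MAML's standard form (Finn et al., 2017), shown here with a single inner gradient step of size α on each task's support loss:

```latex
\theta^{*} \;=\; \arg\min_{\theta} \sum_{\mathcal{T}_i \sim p(\mathcal{T})}
  \mathcal{L}_{\mathcal{T}_i}\!\Bigl(\,
    \underbrace{\theta - \alpha \,\nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(\theta)}_{\text{inner-loop adaptation}}
  \Bigr)
```

The outer loop optimises the shared initialisation θ, while the inner adaptation step is what MAML++ augments with the per-layer, per-step learning rates quoted in the first citation statement above.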