Proceedings of the Genetic and Evolutionary Computation Conference 2021
DOI: 10.1145/3449639.3459277
Optimizing Loss Functions Through Multi-Variate Taylor Polynomial Parameterization

Abstract: Loss function optimization for neural networks has recently emerged as a new direction for metalearning, with Genetic Loss Optimization (GLO) providing a general approach for the discovery and optimization of such functions. GLO represents loss functions as trees that are evolved and further optimized using evolutionary strategies. However, searching in this space is difficult because most candidates are not valid loss functions. In this paper, a new technique, Multivariate Taylor expansion-based genetic loss-…
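The abstract's tree-based representation can be made concrete with a small, purely illustrative sketch: a candidate loss is an expression tree over the prediction and the target, evaluated recursively. The node set and encoding below are assumptions made for this sketch, not taken from the paper.

```python
import math

# Hypothetical expression-tree encoding of a loss, in the spirit of GLO's
# tree representation mentioned in the abstract. Each node is either a leaf
# string ("y_true" / "y_pred") or a pair (operator, children).
cross_entropy_like = ("neg", [("mul", ["y_true", ("log", ["y_pred"])])])

def eval_tree(node, y_true, y_pred):
    """Recursively evaluate a loss expression tree on one (target, prediction) pair."""
    if node == "y_true":
        return y_true
    if node == "y_pred":
        return y_pred
    op, children = node
    args = [eval_tree(child, y_true, y_pred) for child in children]
    return {"neg": lambda a: -a,
            "mul": lambda a, b: a * b,
            "log": lambda a: math.log(a)}[op](*args)

print(eval_tree(cross_entropy_like, 1.0, 0.9))  # ≈ 0.105, i.e. -1.0 * log(0.9)
```

Mutating and recombining subtrees in such a search illustrates why, as the abstract notes, many candidates fail to be valid loss functions (e.g. taking the log of a negative intermediate value).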

Cited by 22 publications (24 citation statements); references 34 publications.
“…Metalearning, aka learning to learn, and AutoML have been applied for a wide variety of purposes, as summarised in [17,21]. Of particular relevance is meta-learning of loss functions, which has been studied for various purposes including providing differentiable surrogates of non-differentiable objectives [19], optimising efficiency and asymptotic performance of learning [22,4,18,48,11,12], and improving robustness to train/test domain-shift [3,30]. We are interested in learning white-box losses, i.e., those that can be expressed as a short human-readable parametric equation, for efficiency and improved task-transferability compared to neural network alternatives [4,18,3,30], which tend to be less interpretable and need to be learned task-specifically.…”
Section: Meta-learning, AutoML and Loss Learning
confidence: 99%
“…We are interested in learning white-box losses, i.e., those that can be expressed as a short human-readable parametric equation, for efficiency and improved task-transferability compared to neural network alternatives [4,18,3,30], which tend to be less interpretable and need to be learned task-specifically. Meta-learning of white-box model components has been demonstrated for optimisers [47], activation functions [35], neural architectures [43], and losses for accelerating conventional supervised learning [11,12]. We are the first to demonstrate the value of automatic loss function discovery for general-purpose label-noise robust learning.…”
Section: Meta-learning, AutoML and Loss Learning
confidence: 99%
“…TaylorGLO parameterization represents a loss function as a modified third-degree Taylor polynomial. Such a parameterization has many desirable properties, such as smoothness and continuity, that make it amenable for evolution [8]. In TaylorGAN, there are three functions that need to be optimized jointly (using the notation described in Table 1):…”
Section: The TaylorGAN Approach
confidence: 99%
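As a rough illustration of the parameterization described above (not the paper's exact formulation), a third-degree bivariate Taylor polynomial in the target and the prediction can serve as a loss whose expansion centre and monomial coefficients together form the evolvable parameter vector. The parameter layout and variable names below are assumptions of this sketch.

```python
import numpy as np

def taylor_loss(y_true, y_pred, theta):
    """Third-degree bivariate Taylor polynomial used as a per-sample loss.

    theta holds the expansion centre (a, b) followed by one coefficient for
    every monomial (dy ** i) * (dp ** j) with i + j <= 3 (10 terms), giving a
    12-dimensional parameter vector. This layout is illustrative only.
    """
    a, b = theta[0], theta[1]
    coeffs = theta[2:]
    dy, dp = y_true - a, y_pred - b

    # Enumerate every monomial of total degree <= 3 in the two shifted inputs.
    terms = [dy**i * dp**j for i in range(4) for j in range(4 - i)]
    per_sample = sum(c * t for c, t in zip(coeffs, terms))
    return np.mean(per_sample)

# Any 12-dimensional real vector now defines one smooth, continuous candidate loss.
rng = np.random.default_rng(0)
theta = rng.normal(size=12)
print(taylor_loss(np.array([1.0, 0.0]), np.array([0.9, 0.2]), theta))
```

Because every parameter vector yields a smooth, continuous function, this search space sidesteps the invalid-candidate problem of tree-based representations, which is the property the statement above highlights.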
“…In this paper, such a technique is developed to evolve entirely new GAN formulations that outperform the standard Wasserstein loss. Leveraging the TaylorGLO loss-function parameterization approach [8], separate loss functions are constructed for the two GAN networks. A genetic algorithm is then used to optimize their parameters against two non-differentiable objectives.…”
Section: Introduction
confidence: 99%
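The statement above leaves the evolutionary loop itself abstract. Below is a minimal sketch of one way such a genetic algorithm could optimize the loss-function parameters; the `evaluate` callback, which would train the GAN with the candidate losses and return a scalar fitness combining the two non-differentiable objectives, is assumed here and left unimplemented, and all operators and hyperparameters are illustrative rather than the paper's.

```python
import numpy as np

def evolve_loss_parameters(evaluate, dim=12, pop_size=20, generations=50, seed=0):
    """Toy truncation-selection GA over real-valued loss parameters (maximizes fitness)."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(generations):
        fitness = np.array([evaluate(theta) for theta in pop])
        # Truncation selection: keep the best quarter of the population as parents.
        parents = pop[np.argsort(fitness)[-(pop_size // 4):]]
        # Offspring: Gaussian mutations of randomly chosen parents.
        pop = (parents[rng.integers(len(parents), size=pop_size)]
               + 0.1 * rng.normal(size=(pop_size, dim)))
    final_fitness = [evaluate(theta) for theta in pop]
    return pop[int(np.argmax(final_fitness))]
```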
“…Moreover, we search over programs, which include non-neural operations and data structures, rather than just neural-network architectures, and decide what loss functions to use for training. Our work also resembles work in the AutoML community (Hutter et al., 2018) that searches in a space of programs, for example in the case of SAT solving (KhudaBukhsh et al., 2009) or auto-sklearn (Feurer et al., 2015), and concurrent work on learning loss functions to replace cross-entropy for training a fixed architecture on MNIST and CIFAR (Gonzalez & Miikkulainen, 2019; 2020). Although we took inspiration from ideas in that community (Jamieson & Talwalkar, 2016; Li et al., 2016), our algorithms specify both how to compute their outputs and their own optimization objectives in order to work well in synchrony with an expensive deep RL algorithm.…”
Section: Related Work
confidence: 99%