2019
DOI: 10.48550/arxiv.1905.11528
Preprint

Improved Training Speed, Accuracy, and Data Utilization Through Loss Function Optimization

Abstract: As the complexity of neural network models has grown, it has become increasingly important to optimize their design automatically through metalearning. Methods for discovering hyperparameters, topologies, and learning rate schedules have led to significant increases in performance. This paper shows that loss functions can be optimized with metalearning as well, and that doing so results in similar improvements. The method, Genetic Loss-function Optimization (GLO), discovers loss functions de novo, and optimizes them for a t…

Cited by 8 publications (17 citation statements) | References 19 publications
“…Genetic Loss Optimization (GLO) [4] provided an initial study into metalearning of loss functions. GLO is based on a two-phase approach that (1) evolves a function structure using a tree representation, and (2) optimizes a structure's coefficients using an evolutionary strategy.…”
Section: A Loss Function Metalearning
confidence: 99%
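The two-phase scheme described in this excerpt can be pictured with a small, self-contained sketch. Everything below is hypothetical: the tuple-based tree format, the primitive set, the toy fitness proxy, and the simple (1+λ) evolution strategy with Gaussian mutation are illustrative stand-ins; in GLO itself, fitness comes from training a model with the candidate loss and measuring its validation performance.

    # A minimal, hypothetical sketch of the two-phase idea (not the authors'
    # code): tree format, primitives, fitness proxy, and ES are illustrative.
    import math
    import random

    # ---- Phase 1: a candidate loss represented as an expression tree ----
    # A node is ("const", value), ("var", name), or (op, child, ...).
    UNARY = {"log": lambda a: math.log(abs(a) + 1e-12), "neg": lambda a: -a}
    BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

    def evaluate(node, env):
        """Evaluate the tree for one example; env maps variable names to values."""
        kind = node[0]
        if kind == "const":
            return node[1]
        if kind == "var":
            return env[node[1]]
        if kind in UNARY:
            return UNARY[kind](evaluate(node[1], env))
        return BINARY[kind](evaluate(node[1], env), evaluate(node[2], env))

    def loss_on_batch(tree, batch):
        """Mean per-example loss defined by the tree over (label, probability) pairs."""
        return sum(evaluate(tree, {"y": y, "p": p}) for y, p in batch) / len(batch)

    # Example candidate: -1.0 * (y * log(p)), a cross-entropy-like structure
    # whose leading constant is the coefficient that phase 2 will tune.
    candidate = ("mul", ("const", -1.0),
                 ("mul", ("var", "y"), ("log", ("var", "p"))))

    # ---- Phase 2: coefficient tuning with a simple (1 + lambda) ES ------
    def get_consts(node, out):
        """Collect constant values from the tree in evaluation order."""
        if node[0] == "const":
            out.append(node[1])
        else:
            for child in node[1:]:
                if isinstance(child, tuple):
                    get_consts(child, out)

    def set_consts(node, values, i=0):
        """Return a copy of the tree with its constants replaced from `values`."""
        if node[0] == "const":
            return ("const", values[i]), i + 1
        if node[0] == "var":
            return node, i
        children = []
        for child in node[1:]:
            child, i = set_consts(child, values, i)
            children.append(child)
        return (node[0],) + tuple(children), i

    def fitness(tree):
        # GLO's real fitness comes from training a model with the candidate
        # loss; this toy proxy only rewards losses that score confident wrong
        # predictions higher than confident correct ones.
        good = [(1.0, 0.9), (1.0, 0.8)]   # (label, predicted probability)
        bad = [(1.0, 0.1), (1.0, 0.2)]
        return loss_on_batch(tree, bad) - loss_on_batch(tree, good)

    coeffs = []
    get_consts(candidate, coeffs)
    best, best_fit = coeffs, fitness(candidate)
    for _ in range(50):                   # 50 generations, lambda = 4
        for _ in range(4):
            child = [c + random.gauss(0.0, 0.1) for c in best]
            tree, _ = set_consts(candidate, child)
            f = fitness(tree)
            if f > best_fit:
                best, best_fit = child, f

    print("tuned coefficients:", best, "fitness:", round(best_fit, 3))

In the full method, phase 1 would mutate and recombine whole trees across a population rather than keep a single fixed structure; the loop above only exercises the coefficient-refinement step on one candidate.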
“…In doing so, it makes it possible to regularize the solutions automatically. Genetic Loss Optimization (GLO) [4] provided an initial implementation of this idea using a combination of genetic programming and evolutionary strategies.…”
Section: Introduction
confidence: 99%
“…In this regard, a recent research direction is concerned with loss function meta-learning, with diverse applications in supervised and reinforcement learning [18][19][20][21][22][23][24][25][26][27][28][29][30][31]. Although different works utilize different meta-learning techniques and have different goals, it has been shown that loss functions obtained via meta-learning can lead to an improved convergence of the gradient-descent-based optimization.…”
Section: Related Work and Motivation
confidence: 99%
“…Meta-learning and Loss Learning: Meta-learning, also known as learning to learn, has been applied for a wide variety of purposes as summarized in [12]. Of particular relevance is meta-learning of loss functions, which has been studied for various purposes including providing differentiable surrogates of non-differentiable objectives [14], optimizing efficiency and asymptotic performance of learning [17,2,13,39,7,8], and improving robustness to train/test domain-shift [1,24]. We are particularly interested in learning white-box losses for efficiency and improved task-transferability compared to neural network alternatives [2,13,1,24].…”
Section: Related Work
confidence: 99%
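The "differentiable surrogates of non-differentiable objectives" mentioned in this excerpt can be illustrated with a small, hypothetical example (not drawn from any of the cited papers): 0-1 error has zero gradient almost everywhere, so a smooth sigmoid of the classification margin stands in for it during training. The function names and the particular surrogate below are illustrative assumptions.

    # Hypothetical contrast between a non-differentiable objective and a
    # smooth surrogate that gradient descent can actually use.
    import torch

    def zero_one_error(scores, labels):
        """Non-differentiable objective: fraction of misclassified examples."""
        return (torch.sign(scores) != labels).float().mean()

    def surrogate_error(scores, labels, temperature=1.0):
        """Smooth stand-in: sigmoid of the negated margin, which has gradients."""
        margin = labels * scores              # > 0 when correct, < 0 when wrong
        return torch.sigmoid(-margin / temperature).mean()

    scores = torch.tensor([2.0, -0.5, 1.2], requires_grad=True)
    labels = torch.tensor([1.0, 1.0, -1.0])   # labels in {-1, +1}
    loss = surrogate_error(scores, labels)
    loss.backward()                           # gradients flow back to the scores
    print(zero_one_error(scores.detach(), labels).item(), scores.grad)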
“…We are particularly interested in learning white-box losses for efficiency and improved task-transferability compared to neural network alternatives [2,13,1,24]. Meta-learning of white-box learner components has been demonstrated for optimizers [38], activation functions [28] and losses for accelerating conventional supervised learning [7,8]. We are the first to demonstrate the value of automatic loss function discovery for general purpose label-noise robust learning.…”
Section: Related Work
confidence: 99%