2019
DOI: 10.48550/arxiv.1910.12686
Preprint

Growing axons: greedy learning of neural networks with application to function approximation

Abstract: We propose a new method for learning deep neural network models that is based on a greedy learning approach: we add one basis function at a time, and each new basis function is generated as a non-linear activation function applied to a linear combination of the previous basis functions. Such a method (growing a deep neural network one neuron at a time) allows us to compute much more accurate approximants for several model problems in function approximation.
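To make the greedy construction concrete, here is a minimal NumPy sketch of this style of one-neuron-at-a-time growth. It is not the authors' implementation: the inner weights of each new neuron are picked by crude random search rather than by the optimization used in the paper, and the function name grow_basis, its parameters, and the choice of ReLU are illustrative assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def grow_basis(x, y, n_new=10, n_trials=200, seed=0):
    """Greedily grow a basis one neuron at a time (illustrative sketch).

    Each new basis function is relu(B @ w): a ReLU applied to a linear
    combination of all previously built basis functions.  The inner
    weights w are chosen here by crude random search (the paper instead
    optimizes them), and the output coefficients are refit by least
    squares after every addition.
    """
    rng = np.random.default_rng(seed)
    B = np.column_stack([np.ones_like(x), x])        # start with the basis {1, x}
    for _ in range(n_new):
        best_err, best_col = np.inf, None
        for _ in range(n_trials):
            w = rng.standard_normal(B.shape[1])
            cand = relu(B @ w)[:, None]              # candidate new basis function
            trial = np.hstack([B, cand])
            c, *_ = np.linalg.lstsq(trial, y, rcond=None)
            err = np.linalg.norm(trial @ c - y)
            if err < best_err:
                best_err, best_col = err, cand
        B = np.hstack([B, best_col])                 # keep the best new neuron
    c, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B, c

# Example: approximate f(x) = x**2 on [0, 1]
x = np.linspace(0.0, 1.0, 512)
y = x**2
B, c = grow_basis(x, y)
print("max error:", np.abs(B @ c - y).max())
```

The structural point matches the abstract: every new basis function is an activation of a linear combination of all basis functions built so far, and the outer coefficients are refit after each addition.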

Cited by 3 publications (6 citation statements)
References 12 publications
“…In theory, this suggests MLPs may be competitive with traditional FEM [Opschoor et al., 2019, He et al., 2018]. In practice, however, the training error of MLPs optimized with first-order methods fails to converge at all in large architecture limits [Fokina and Oseledets, 2019]. Partition-of-unity networks (POU-nets) [Lee et al., 2021] repurpose classification architectures to instead partition space and obtain localized polynomial approximation, and have recently demonstrated hp-convergence during training.…”
Section: Introduction (mentioning)
confidence: 99%
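As a rough illustration of the partition-of-unity idea mentioned in the quote above (a sketch under assumed ingredients, not the POU-net architecture of Lee et al., 2021): a softmax over some feature map yields partition functions that sum to one, each partition carries a low-order polynomial, and with the partitions held fixed all polynomial coefficients follow from a single linear least-squares solve.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pou_features(x, W, b, degree):
    """Partition functions (softmax of affine features) times polynomial powers."""
    phi = softmax(np.outer(x, W) + b)                   # (n, n_parts), rows sum to 1
    powers = np.vander(x, degree + 1, increasing=True)  # (n, degree+1)
    return (phi[:, :, None] * powers[:, None, :]).reshape(len(x), -1)

def pou_fit(x, y, n_parts=8, degree=2, seed=0):
    """Toy partition-of-unity regression (sketch, not Lee et al.'s POU-net).

    A fixed random affine layer plus softmax defines the partition; each
    partition carries a polynomial of the given degree, and all polynomial
    coefficients come from one least-squares solve.  In a real POU-net the
    partition network itself is trained.
    """
    rng = np.random.default_rng(seed)
    W, b = 8.0 * rng.standard_normal(n_parts), rng.standard_normal(n_parts)
    coef, *_ = np.linalg.lstsq(pou_features(x, W, b, degree), y, rcond=None)
    return lambda xq: pou_features(xq, W, b, degree) @ coef

# Example: piecewise-polynomial fit of |x| on [-1, 1]
x = np.linspace(-1.0, 1.0, 400)
model = pou_fit(x, np.abs(x))
print("max error on |x|:", np.abs(model(x) - np.abs(x)).max())
```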
“…The current approach adopts the probabilistic viewpoint with the aim of improving approximation, while the previously cited works generally focus on quantifying uncertainty. In a deterministic context several works have pursued other strategies to realize convergence in deep networks [He et al., 2018, Cyr et al., 2020, Adcock and Dexter, 2021, Fokina and Oseledets, 2019, Ainsworth and Dong, 2021]. In the context of ML for reduced basis construction, several works have focused primarily on using either Gaussian processes and PCA [Guo and Hesthaven, 2018] or classical/variational autoencoders as replacements for PCA [Carlberg, 2020, Lopez and Atzberger, 2020] in classical ROM schemes; this is distinct from the control volume type surrogates considered in which requires a reduced basis corresponding to a partition of space.…”
Section: Introduction (mentioning)
confidence: 99%
“…The variable projection method has been used for shallow (one hidden layer) neural networks in Pereyra et al. (2006). An LS approach was also used in a greedy algorithm to generate adaptive basis elements by Fokina and Oseledets (2019).…”
Section: Hybrid Least Squares/GD Training Approach (mentioning)
confidence: 99%
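A small sketch of the least-squares idea referenced in this quote (a generic illustration, not the cited implementations): with the hidden-layer parameters of a one-hidden-layer network held fixed, the output weights enter linearly and can be recovered by a single linear solve, which is the elimination step that variable projection builds on. The helper name solve_output_weights and the tanh/width choices are assumptions.

```python
import numpy as np

def solve_output_weights(W, b, x, y):
    """Given fixed hidden weights/biases of a one-hidden-layer tanh network,
    recover the linear output weights by least squares (variable-projection
    style elimination of the linear parameters; illustrative sketch)."""
    H = np.tanh(x[:, None] @ W + b)          # hidden activations, shape (n, width)
    c, *_ = np.linalg.lstsq(H, y, rcond=None)
    return c, H @ c                          # output weights and the resulting fit

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 300)
y = np.sin(np.pi * x)
W, b = rng.standard_normal((1, 64)), rng.standard_normal(64)
c, fit = solve_output_weights(W, b, x, y)
print("RMS error:", np.sqrt(np.mean((fit - y) ** 2)))
```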
“…Optimizers are susceptible to finding suboptimal local minima of loss functionals, and as a result DNN regression typically stagnates after achieving only a few digits of accuracy. For example, using the aforementioned architecture of Yarotsky (2017) for the deep ReLU emulator of x → x², but with random initial weights, Fokina and Oseledets (2019) showed that training with stochastic gradient descent to approximate x → x² fails to demonstrate a significant improvement in error with depth, let alone exponential convergence with the number of layers. Lu et al. (2018, 2019) demonstrate consistent failure of deep ReLU networks to approximate the function |x| on [−1, 1] due to gradient death at initialization.…”
Section: Introduction (mentioning)
confidence: 99%
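To convey the flavour of the experiment described above, the following PyTorch sketch trains a plain fully connected ReLU stack with random initialization on x → x² using SGD. The depth, width, learning rate, and step count are arbitrary assumptions, and the network is a generic MLP rather than Yarotsky's hand-constructed emulator.

```python
import torch

def train_relu_net(depth=8, width=16, steps=5000, lr=1e-3, seed=0):
    """Train a deep ReLU MLP with plain SGD to approximate x -> x**2 on [0, 1]
    (illustrative setup; hyperparameters are not taken from the cited work)."""
    torch.manual_seed(seed)
    layers, in_dim = [], 1
    for _ in range(depth):
        layers += [torch.nn.Linear(in_dim, width), torch.nn.ReLU()]
        in_dim = width
    layers.append(torch.nn.Linear(in_dim, 1))
    net = torch.nn.Sequential(*layers)

    x = torch.rand(1024, 1)
    y = x ** 2
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((net(x) - y) ** 2)
        loss.backward()
        opt.step()
    return loss.item()

# Per the observation quoted above, the loss tends to stall after a few
# digits of accuracy regardless of depth.
for d in (2, 4, 8):
    print(d, train_relu_net(depth=d))
```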
“…DNNs possess attractive properties: potentially exponential convergence, breaking of the curse of dimensionality, and an ability to handle data sampled from function spaces with limited regularity, such as shock and contact discontinuities [1,2,3,4,5,6,7,8,9]. In practice, however, challenges in training DNNs often prevent the realization of convergent schemes for forward problems [10,11,12,13]. For inverse problems, however, a number of methods have emerged that train neural networks to simultaneously match target data while minimizing a PDE residual [14,15,16]; these have found application across a wide range of applied mathematics problems [17,18,19,20,21].…”
Section: Introduction (mentioning)
confidence: 99%
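The "match target data while minimizing a PDE residual" recipe quoted above amounts to a composite loss. Below is a schematic PyTorch sketch for a toy 1-D Poisson problem u'' = -π² sin(πx); the particular PDE, network, collocation sampling, and residual weight lam are assumptions for illustration and are not taken from the cited methods.

```python
import math
import torch

# Small network u_theta(x) used as the solution ansatz.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def pde_residual(x):
    """Residual of u''(x) + pi^2 sin(pi x) computed with autograd."""
    x = x.clone().detach().requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return d2u + math.pi ** 2 * torch.sin(math.pi * x)

x_data = torch.rand(32, 1)                 # "measurements" of the target solution
u_data = torch.sin(math.pi * x_data)
x_col = torch.rand(256, 1)                 # collocation points for the residual
lam = 1.0                                  # residual weight (arbitrary choice)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = torch.mean((net(x_data) - u_data) ** 2) \
         + lam * torch.mean(pde_residual(x_col) ** 2)
    loss.backward()
    opt.step()
print("final composite loss:", loss.item())
```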