2019
DOI: 10.48550/arxiv.1910.12686
Preprint

Growing axons: greedy learning of neural networks with application to function approximation

Abstract: We propose a new method for learning deep neural network models that is based on a greedy learning approach: we add one basis function at a time, and each new basis function is generated as a non-linear activation function applied to a linear combination of the previous basis functions. Such a method (growing a deep neural network one neuron at a time) allows us to compute much more accurate approximants for several model problems in function approximation.
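To make the greedy construction concrete, here is a minimal NumPy sketch of this style of one-neuron-at-a-time growth. It is not the authors' implementation: the inner weights of each new neuron are picked by crude random search rather than by the optimization used in the paper, and the function name grow_basis, its parameters, and the choice of ReLU are illustrative assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def grow_basis(x, y, n_new=10, n_trials=200, seed=0):
    """Greedily grow a basis one neuron at a time (illustrative sketch).

    Each new basis function is relu(B @ w): a ReLU applied to a linear
    combination of all previously built basis functions.  The inner
    weights w are chosen here by crude random search (the paper instead
    optimizes them), and the output coefficients are refit by least
    squares after every addition.
    """
    rng = np.random.default_rng(seed)
    B = np.column_stack([np.ones_like(x), x])        # start with the basis {1, x}
    for _ in range(n_new):
        best_err, best_col = np.inf, None
        for _ in range(n_trials):
            w = rng.standard_normal(B.shape[1])
            cand = relu(B @ w)[:, None]              # candidate new basis function
            trial = np.hstack([B, cand])
            c, *_ = np.linalg.lstsq(trial, y, rcond=None)
            err = np.linalg.norm(trial @ c - y)
            if err < best_err:
                best_err, best_col = err, cand
        B = np.hstack([B, best_col])                 # keep the best new neuron
    c, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B, c

# Example: approximate f(x) = x**2 on [0, 1]
x = np.linspace(0.0, 1.0, 512)
y = x**2
B, c = grow_basis(x, y)
print("max error:", np.abs(B @ c - y).max())
```

The structural point matches the abstract: every new basis function is an activation of a linear combination of all basis functions built so far, and the outer coefficients are refit after each addition.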

Cited by 3 publications (6 citation statements)
References 12 publications
“…In theory, this suggests MLPs may be competitive with traditional FEM [Opschoor et al., 2019, He et al., 2018]. In practice, however, the training error of MLPs optimized with first-order methods fails to converge at all in large architecture limits [Fokina and Oseledets, 2019]. Partition-of-unity networks (POU-nets) [Lee et al., 2021] repurpose classification architectures to instead partition space and obtain localized polynomial approximation, and have recently demonstrated hp-convergence during training.…”
Section: Introduction (mentioning)
confidence: 99%
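As a rough illustration of the partition-of-unity idea mentioned in the quote above (a sketch under assumed ingredients, not the POU-net architecture of Lee et al., 2021): a softmax over some feature map yields partition functions that sum to one, each partition carries a low-order polynomial, and with the partitions held fixed all polynomial coefficients follow from a single linear least-squares solve.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pou_features(x, W, b, degree):
    """Partition functions (softmax of affine features) times polynomial powers."""
    phi = softmax(np.outer(x, W) + b)                   # (n, n_parts), rows sum to 1
    powers = np.vander(x, degree + 1, increasing=True)  # (n, degree+1)
    return (phi[:, :, None] * powers[:, None, :]).reshape(len(x), -1)

def pou_fit(x, y, n_parts=8, degree=2, seed=0):
    """Toy partition-of-unity regression (sketch, not Lee et al.'s POU-net).

    A fixed random affine layer plus softmax defines the partition; each
    partition carries a polynomial of the given degree, and all polynomial
    coefficients come from one least-squares solve.  In a real POU-net the
    partition network itself is trained.
    """
    rng = np.random.default_rng(seed)
    W, b = 8.0 * rng.standard_normal(n_parts), rng.standard_normal(n_parts)
    coef, *_ = np.linalg.lstsq(pou_features(x, W, b, degree), y, rcond=None)
    return lambda xq: pou_features(xq, W, b, degree) @ coef

# Example: piecewise-polynomial fit of |x| on [-1, 1]
x = np.linspace(-1.0, 1.0, 400)
model = pou_fit(x, np.abs(x))
print("max error on |x|:", np.abs(model(x) - np.abs(x)).max())
```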
“…The current approach adopts the probabilistic viewpoint with the aim of improving approximation, while the previously cited works generally focus on quantifying uncertainty. In a deterministic context several works have pursued other strategies to realize convergence in deep networks [He et al., 2018, Cyr et al., 2020, Adcock and Dexter, 2021, Fokina and Oseledets, 2019, Ainsworth and Dong, 2021]. In the context of ML for reduced basis construction, several works have focused primarily on using either Gaussian processes and PCA [Guo and Hesthaven, 2018] or classical/variational autoencoders as replacements for PCA [Carlberg, 2020, Lopez and Atzberger, 2020] in classical ROM schemes; this is distinct from the control volume type surrogates considered in which requires a reduced basis corresponding to a partition of space.…”
Section: Introduction (mentioning)
confidence: 99%
“…The variable projection method has been used for shallow (one hidden layer) neural networks in Pereyra et al. (2006). An LS approach was also used in a greedy algorithm to generate adaptive basis elements by Fokina and Oseledets (2019).…”
Section: Hybrid Least Squares/GD Training Approach (mentioning)
confidence: 99%
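A small sketch of the least-squares idea referenced in this quote (a generic illustration, not the cited implementations): with the hidden-layer parameters of a one-hidden-layer network held fixed, the output weights enter linearly and can be recovered by a single linear solve, which is the elimination step that variable projection builds on. The helper name solve_output_weights and the tanh/width choices are assumptions.

```python
import numpy as np

def solve_output_weights(W, b, x, y):
    """Given fixed hidden weights/biases of a one-hidden-layer tanh network,
    recover the linear output weights by least squares (variable-projection
    style elimination of the linear parameters; illustrative sketch)."""
    H = np.tanh(x[:, None] @ W + b)          # hidden activations, shape (n, width)
    c, *_ = np.linalg.lstsq(H, y, rcond=None)
    return c, H @ c                          # output weights and the resulting fit

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 300)
y = np.sin(np.pi * x)
W, b = rng.standard_normal((1, 64)), rng.standard_normal(64)
c, fit = solve_output_weights(W, b, x, y)
print("RMS error:", np.sqrt(np.mean((fit - y) ** 2)))
```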
“…Optimizers are susceptible to finding suboptimal local minima of loss functionals, and as a result DNN regression typically stagnates after achieving only a few digits of accuracy. For example, using the aforementioned architecture of Yarotsky (2017) for the deep ReLU emulator of x → x², but with random initial weights, Fokina and Oseledets (2019) showed that training with stochastic gradient descent to approximate x → x² fails to demonstrate a significant improvement in error with depth, let alone exponential convergence with the number of layers. Lu et al. (2018, 2019) demonstrate consistent failure of deep ReLU networks to approximate the function |x| on [−1, 1] due to gradient death at initialization.…”
Section: Introduction (mentioning)
confidence: 99%
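To convey the flavour of the experiment described above, the following PyTorch sketch trains a plain fully connected ReLU stack with random initialization on x → x² using SGD. The depth, width, learning rate, and step count are arbitrary assumptions, and the network is a generic MLP rather than Yarotsky's hand-constructed emulator.

```python
import torch

def train_relu_net(depth=8, width=16, steps=5000, lr=1e-3, seed=0):
    """Train a deep ReLU MLP with plain SGD to approximate x -> x**2 on [0, 1]
    (illustrative setup; hyperparameters are not taken from the cited work)."""
    torch.manual_seed(seed)
    layers, in_dim = [], 1
    for _ in range(depth):
        layers += [torch.nn.Linear(in_dim, width), torch.nn.ReLU()]
        in_dim = width
    layers.append(torch.nn.Linear(in_dim, 1))
    net = torch.nn.Sequential(*layers)

    x = torch.rand(1024, 1)
    y = x ** 2
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((net(x) - y) ** 2)
        loss.backward()
        opt.step()
    return loss.item()

# Per the observation quoted above, the loss tends to stall after a few
# digits of accuracy regardless of depth.
for d in (2, 4, 8):
    print(d, train_relu_net(depth=d))
```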
“…DNNs possess attractive properties: potentially exponential convergence, breaking of the curse of dimensionality, and an ability to handle data sampled from function spaces with limited regularity, such as shock and contact discontinuities [1,2,3,4,5,6,7,8,9]. In practice, however, challenges in training DNNs often prevent the realization of convergent schemes for forward problems [10,11,12,13]. For inverse problems, however, a number of methods have emerged that train neural networks to simultaneously match target data while minimizing a PDE residual [14,15,16]; these have found application across a wide range of applied mathematics problems [17,18,19,20,21].…”
Section: Introduction (mentioning)
confidence: 99%
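The "match target data while minimizing a PDE residual" recipe quoted above amounts to a composite loss. Below is a schematic PyTorch sketch for a toy 1-D Poisson problem u'' = -π² sin(πx); the particular PDE, network, collocation sampling, and residual weight lam are assumptions for illustration and are not taken from the cited methods.

```python
import math
import torch

# Small network u_theta(x) used as the solution ansatz.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def pde_residual(x):
    """Residual of u''(x) + pi^2 sin(pi x) computed with autograd."""
    x = x.clone().detach().requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return d2u + math.pi ** 2 * torch.sin(math.pi * x)

x_data = torch.rand(32, 1)                 # "measurements" of the target solution
u_data = torch.sin(math.pi * x_data)
x_col = torch.rand(256, 1)                 # collocation points for the residual
lam = 1.0                                  # residual weight (arbitrary choice)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = torch.mean((net(x_data) - u_data) ** 2) \
         + lam * torch.mean(pde_residual(x_col) ** 2)
    loss.backward()
    opt.step()
print("final composite loss:", loss.item())
```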