2019
DOI: 10.48550/arxiv.1912.04862
Preprint

Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

Abstract: Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs ar…
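
A minimal sketch of the adaptive basis viewpoint named in the abstract: the outputs of the last hidden layer are treated as basis functions ϕ_i(x), and the network output is the linear combination Σ_i c_i ϕ_i(x), so with the hidden parameters frozen the output-layer coefficients c solve an ordinary linear least-squares problem. The two-hidden-layer tanh architecture, layer widths, and toy data below are illustrative assumptions, not the paper's exact construction.

```python
# Adaptive-basis view of a DNN: the last hidden layer's outputs are
# treated as basis functions phi_i(x); the output layer is the linear
# combination sum_i c_i * phi_i(x).  Architecture and data are toy
# assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

def hidden_basis(x, W1, b1, W2, b2):
    """Evaluate the hidden layers; each column of the result is one
    adaptive basis function phi_i evaluated at the points x."""
    h = np.tanh(x @ W1 + b1)
    return np.tanh(h @ W2 + b2)            # shape (n_points, n_basis)

# Toy 1-D regression target
x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(np.pi * x).ravel()

n_basis = 20
W1, b1 = rng.normal(size=(1, n_basis)), rng.normal(size=n_basis)
W2, b2 = rng.normal(size=(n_basis, n_basis)), rng.normal(size=n_basis)

# With the hidden parameters frozen, the output-layer coefficients c are
# the solution of a linear least-squares problem in the basis matrix Phi.
Phi = hidden_basis(x, W1, b1, W2, b2)
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("training RMSE:", np.sqrt(np.mean((Phi @ c - y) ** 2)))
```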

Cited by 6 publications (12 citation statements)
References 16 publications (26 reference statements)

“…While we have presented here a number of techniques to obtain qualitatively correct and physically meaningful solutions, the barrier in achieving convergence of error with respect to neural network size remains a major challenge to obtaining DNN solutions competitive with traditional finite element/volume methods. We refer the interested reader to some of our ongoing work in this area [75].…”
Section: Discussion (mentioning)
confidence: 99%
“…The standard approach to computing optima of objective functions associated with neural networks is to apply a gradient-based optimizer to all of the parameters in θ, for example [1,6,14,16,21,23,24]. In order to solve (2.16), we instead propose a training procedure in the spirit of [4] as follows:…”
Section: Neural Network Approximation Of Augmented Basis Function In ... (mentioning)
confidence: 99%
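
A minimal sketch of a hybrid least-squares/gradient-descent iteration of the kind the quoted passage proposes, assuming a single-hidden-layer tanh network and a mean-squared-error loss (both assumptions for illustration): every step re-solves the output-layer coefficients exactly by linear least squares and then takes a gradient step on the hidden weights and biases only.

```python
# Hybrid least-squares / gradient-descent training loop (illustrative):
# the output-layer coefficients c are recomputed by an exact least-squares
# solve each iteration; only the hidden parameters w, b see gradient steps.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(np.pi * x)

n_basis, lr, n_steps = 20, 1e-2, 500      # assumed hyperparameters
w = rng.normal(size=n_basis)              # hidden weights
b = rng.normal(size=n_basis)              # hidden biases

for _ in range(n_steps):
    Phi = np.tanh(np.outer(x, w) + b)              # (N, n_basis) basis matrix
    c, *_ = np.linalg.lstsq(Phi, y, rcond=None)    # exact LS solve for output layer
    r = Phi @ c - y                                 # residual at the optimal c
    dPhi = 1.0 - Phi ** 2                           # tanh'(w_i * x + b_i)
    # Gradient of the mean-squared error w.r.t. the hidden parameters
    grad_w = 2.0 / x.size * ((r[:, None] * dPhi * x[:, None]) * c).sum(axis=0)
    grad_b = 2.0 / x.size * ((r[:, None] * dPhi) * c).sum(axis=0)
    w -= lr * grad_w
    b -= lr * grad_b

Phi = np.tanh(np.outer(x, w) + b)
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("final RMSE:", np.sqrt(np.mean((Phi @ c - y) ** 2)))
```

Solving for c exactly at every step removes the output layer from the gradient search entirely, which is the intended benefit of such a hybrid scheme over applying a gradient-based optimizer to all of the parameters at once.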
“…The learning rate for each basis function ϕ_i is α_i = 1×10^−2 / 1.1^(i−1). The hidden parameters are initialized according to the box initialization in [4]. We employ a fixed tensor product Gauss-Legendre quadrature rule with 100×100 nodes in order to approximate inner products in the interior of the domain.…”
Section: Beam With Applied Couple (mentioning)
confidence: 99%
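
The quoted setup combines two concrete ingredients: geometrically decaying per-basis learning rates α_i = 1×10^−2 / 1.1^(i−1) and a fixed 100×100 tensor-product Gauss-Legendre rule for approximating inner products. The sketch below shows one plausible realization in NumPy; the domain [−1, 1]², the basis count, and the placeholder integrand are assumptions.

```python
# (i) Per-basis-function learning rates alpha_i = 1e-2 / 1.1**(i-1), and
# (ii) a fixed 100x100 tensor-product Gauss-Legendre quadrature rule on
# [-1, 1]^2 for approximating inner products.  Domain, basis count, and
# the example integrand are assumptions for illustration.
import numpy as np

n_basis = 20
alpha = 1e-2 / 1.1 ** np.arange(n_basis)    # alpha_i for i = 1..n_basis

# 1-D Gauss-Legendre nodes and weights on [-1, 1]
nodes, weights = np.polynomial.legendre.leggauss(100)

# Tensor-product rule: 100x100 nodes, outer-product weights
X, Y = np.meshgrid(nodes, nodes, indexing="ij")
W2d = np.outer(weights, weights)

def inner_product(f, g):
    """Approximate the L2 inner product of f and g over [-1, 1]^2."""
    return np.sum(W2d * f(X, Y) * g(X, Y))

# Check against an integral with a known value: (x^2, y^2) = 4/9
val = inner_product(lambda x, y: x ** 2, lambda x, y: y ** 2)
print(val, "vs exact", 4.0 / 9.0)
```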
“…However, it may also have different domain and image dimensionality based on the structure of network [31,30]. An adaptive basis viewpoint of DNNs is also given in [40].…”
Section: hp-Variational Physics-Informed Neural Network (hp-VPINN) (mentioning)
confidence: 99%