2017
DOI: 10.1088/1361-6420/aa9a90
Stable architectures for deep neural networks

Abstract: Deep neural networks have become invaluable tools for supervised machine learning, e.g., classification of text or images. While often offering superior results over traditional techniques and successfully expressing complicated patterns in data, deep architectures are known to be challenging to design and train such that they generalize well to new data. Critical issues with deep architectures are numerical instabilities in derivative-based learning algorithms commonly called exploding or vanishing gradients.…
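To make the exploding/vanishing-gradient issue named in the abstract concrete, here is a minimal, self-contained sketch (not taken from the paper): it backpropagates a gradient through many identical linear layers and shows how its norm grows or decays geometrically depending on the scale of the layer weights. The layer width, depth, and scaling factors are arbitrary illustrative choices.

```python
import numpy as np

# Minimal illustration (not from the paper): backpropagating through a deep
# stack of identical linear layers multiplies the gradient by the same
# Jacobian at every step, so its norm grows or shrinks geometrically.
rng = np.random.default_rng(0)
width, depth = 64, 100

def gradient_norm_after(depth, scale):
    # Random weight matrix with entries of standard deviation scale / sqrt(width),
    # so its spectral radius is roughly `scale`.
    W = scale * rng.standard_normal((width, width)) / np.sqrt(width)
    g = np.ones(width)               # gradient arriving at the last layer
    for _ in range(depth):
        g = W.T @ g                  # chain rule through one linear layer
    return np.linalg.norm(g)

for scale in (0.8, 1.0, 1.2):
    print(f"scale={scale:.1f}  ||grad|| after {depth} layers: "
          f"{gradient_norm_after(depth, scale):.3e}")
```

With weights scaled slightly below one the gradient norm collapses toward zero (vanishing), and slightly above one it blows up (exploding), which is the instability the paper's stable architectures are designed to avoid.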
Cited by 494 publications (597 citation statements).
References 47 publications.
“…For image deblurring, images computed with too small values of T remain blurry, while for T > T̄ ringing artifacts are generated and their intensity increases with larger T. For a corrupted image, the associated adjoint state requires the knowledge of the ground truth for the terminal condition (14), which is in general not available. However, Figure 7 shows that the learned average optimal stopping time T̄ yields the smallest expected error.…”
Section: Results
Citation type: mentioning
Confidence: 99%
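The statement above is about learning a stopping time for an ODE-based image-restoration flow. As a loose, self-contained analogue (not the cited method or its deblurring setup), the sketch below runs an explicit heat-equation smoothing flow on noisy 1-D signals and picks the stopping index that minimizes the average reconstruction error over several samples, mimicking the idea that an average stopping time learned from data yields the smallest expected error. Grid size, noise level, and step size are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(1)
n, steps, h = 256, 400, 0.2          # grid size, time steps, step size (toy choices)

def heat_step(u, h):
    # Explicit Euler step of the 1-D heat equation u_t = u_xx (stable for h <= 0.5).
    return u + h * (np.roll(u, 1) - 2 * u + np.roll(u, -1))

# Average the reconstruction error over several noisy samples at every
# stopping index, then pick the index with the smallest expected error.
errors = np.zeros(steps + 1)
n_samples = 30
for _ in range(n_samples):
    x = np.linspace(0, 4 * np.pi, n)
    u_true = np.sign(np.sin(x))                      # piecewise-constant signal
    u = u_true + 0.4 * rng.standard_normal(n)        # noisy observation
    for k in range(steps + 1):
        errors[k] += np.linalg.norm(u - u_true) / n_samples
        u = heat_step(u, h)

k_best = int(np.argmin(errors))
print(f"average-optimal stopping index: {k_best}")
print(f"error at k=0: {errors[0]:.2f}, at k_best: {errors[k_best]:.2f}, "
      f"at k={steps}: {errors[steps]:.2f}")
```

Stopping too early leaves the noise in place and stopping too late over-smooths the signal, so the averaged error curve has an interior minimum, analogous to the learned stopping time T̄ in the quoted result.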
“…Deep neural networks are capable of approximating nonlinear dynamical systems, as shown in many studies 102,103,106,123. The general nonlinear dynamical system can be represented by an equation of the form…”
Section: Learning Framework
Citation type: mentioning
Confidence: 99%
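The quoted sentence breaks off before the equation itself, which is left elided above. As a hedged illustration of the generic form such statements usually refer to (the symbols y, θ, f, and h below are assumptions, not the citing paper's notation), a continuous-time state equation and the forward-Euler discretization that recovers a ResNet-style layer update can be written as:

```latex
% Assumed generic form, not the citing paper's exact equation:
\[
  \dot{y}(t) = f\bigl(y(t),\,\theta(t)\bigr), \qquad t \in (0, T], \qquad y(0) = y_0 .
\]
% Forward-Euler discretization with step size h gives a ResNet-style layer update:
\[
  y_{j+1} = y_j + h\, f\bigl(y_j,\,\theta_j\bigr), \qquad j = 0, \dots, N-1 .
\]
```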
“…The authors adopted the discretize-then-differentiate viewpoint on the parameter estimation problem and suggested symplectic numerical integration in order to achieve better stability. As mentioned above, our work contrasts in that inference is always exact during learning, unlike the more involved architecture of [HR17] where learning is based on approximate inference. Furthermore, in our case, symplectic numerical integration is a consequence of making the diagram of Figure 2.2 (page 8) commute.…”
Citation type: mentioning
Confidence: 99%
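The symplectic integration mentioned in this statement can be illustrated with a staggered (symplectic-Euler/Verlet-style) forward pass, one common choice for Hamiltonian-inspired residual networks. The sketch below is a generic toy, not the architecture of [HR17] or of the citing paper; the layer sizes, nonlinearity, weights, and step size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d, depth, h = 8, 50, 0.1            # feature dimension, layers, step size (toy)

# Per-layer weights and biases for a toy Hamiltonian-style network.
K = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(depth)]
b = [np.zeros(d) for _ in range(depth)]

def sigma(x):
    return np.tanh(x)

def staggered_forward(y0, z0):
    # Alternating updates of the two half-states, mimicking the staggered
    # structure of symplectic integrators for separable systems; this helps
    # keep the norm of the forward propagation from growing uncontrollably.
    y, z = y0.copy(), z0.copy()
    for j in range(depth):
        z = z - h * sigma(K[j].T @ y + b[j])   # update z using the current y
        y = y + h * sigma(K[j] @ z + b[j])     # update y using the new z
    return y, z

y_out, z_out = staggered_forward(rng.standard_normal(d), np.zeros(d))
print("||y_out|| =", np.linalg.norm(y_out), " ||z_out|| =", np.linalg.norm(z_out))
```

The design point is that each layer advances one half of the state using the freshly updated other half, the same discretize-then-propagate structure the quoted passage contrasts with its own exact-inference approach.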