We analyze approximation rates of deep ReLU neural networks for Sobolev-regular functions when the approximation error is measured with respect to weaker Sobolev norms. First, based on a calculus of ReLU networks, we explicitly construct artificial neural networks with ReLU activation functions that achieve these approximation rates. Second, we establish lower bounds for the approximation of classes of Sobolev-regular functions by ReLU neural networks. Our results extend recent advances in the approximation theory of ReLU networks to the regime that is most relevant for applications in the numerical analysis of partial differential equations.
, which encourages the network to encode information about the derivatives of $f$ in its weights. The authors of [16] call this method Sobolev training and report reduced generalization errors and better data efficiency in a network compression task (see [31]) and in an application to synthetic gradients (see [34]). In the case of network compression, the approximated function $f$ is a function realized by a possibly very large neural network $N_{\mathrm{large}}(\cdot \mid w)$ that has been trained for some supervised learning task and is to be learned by a smaller network $N_{\mathrm{small}}$. In contrast to the usual supervised learning setting, the approximated function $f(\cdot) = N_{\mathrm{large}}(\cdot \mid w)$ is known and its derivatives can be computed. A minimal sketch of such a derivative-matching loss is given at the end of this section.

• Motivated by the performance of deep learning-based solutions in classical machine learning tasks and, in particular, by their ability to overcome the curse of dimension, neural networks are now also applied to the approximate solution of partial differential equations (PDEs) (see [26, 36, 54, 59]). In [54] the authors present their deep Galerkin method for approximating solutions of high-dimensional quasilinear parabolic PDEs. For this, a functional $J(f)$ encoding the differential operator, the boundary conditions, and the initial conditions is introduced. A neural network $N_{\mathrm{PDE}}$ with weights $w$ is then trained to minimize the functional $J(N_{\mathrm{PDE}}(w))$. This is done by discretizing the functional and randomly sampling spatial points; a corresponding sketch also appears at the end of this section.

The theoretical foundation for approximating a function together with its higher-order derivatives by a neural network was already given in a less well-known version of the universal approximation theorem by Hornik in [32, Theorem 3]. In particular, it was shown there that if the activation function $\varrho$ is $k$-times continuously differentiable, non-constant, and bounded, then any $k$-times continuously differentiable function $f$ and its derivatives up to order $k$ can be uniformly approximated by a shallow neural network on compact sets. Note, though, that the conditions on the activation function are very restrictive and that, for example, the ReLU is not covered by this result. However, in [16] it was shown that the theorem also holds for shallow ReLU networks if $k = 1$. Theorem 3 in [32] was also used in [54] to show the existence of a shallow network approximating solutions of the PDEs considered in that paper. An important aspect that is untouched by the previous approximation results is how the complexity of a network and, in particular, its depth...
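To make the derivative-matching idea behind Sobolev training more concrete, the following Python/PyTorch sketch shows a loss that penalizes errors in both function values and gradients. It is a minimal illustration under simplifying assumptions (a scalar target $f$ with known gradient, squared-error penalties, a toy two-dimensional example); it is not the exact loss or setting used in [16], and all names below are illustrative.

```python
import torch

def sobolev_loss(net, f, grad_f, x):
    """Squared-error loss on function values plus gradients (a sketch).

    In the network-compression setting described above, f and grad_f would be
    N_large and its automatically computed input gradients.
    """
    x = x.detach().requires_grad_(True)              # enable derivatives w.r.t. x
    y = net(x)                                       # shape (batch, 1)
    # Per-sample gradient of the network output with respect to its input.
    dy_dx, = torch.autograd.grad(y.sum(), x, create_graph=True)
    value_term = ((y - f(x)) ** 2).mean()
    grad_term = ((dy_dx - grad_f(x)) ** 2).sum(dim=1).mean()
    return value_term + grad_term

# Toy usage: learn f(x) = sin(x1) + x2^2 on [-1, 1]^2, whose gradient is known.
net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
f = lambda x: torch.sin(x[:, :1]) + x[:, 1:] ** 2
grad_f = lambda x: torch.cat([torch.cos(x[:, :1]), 2 * x[:, 1:]], dim=1)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(1000):
    x = torch.rand(64, 2) * 2 - 1                    # sample training points
    loss = sobolev_loss(net, f, grad_f, x)
    opt.zero_grad(); loss.backward(); opt.step()
```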
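In the same spirit, the following sketch shows the basic structure of a deep-Galerkin-style training loop: a discretized functional $J$ built from the PDE residual and the boundary conditions is minimized over the network weights by Monte Carlo sampling of spatial points. The concrete problem here (a one-dimensional model problem $-u'' = 1$ on $(0,1)$ with zero boundary values) and all names are assumptions made for illustration, not the setting of [54]; a smooth activation is used because the residual involves second derivatives, which vanish almost everywhere for ReLU networks.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def residual(net, x):
    """PDE residual -u''(x) - 1 of the illustrative problem -u'' = 1."""
    x = x.requires_grad_(True)
    u = net(x)
    du, = torch.autograd.grad(u.sum(), x, create_graph=True)
    d2u, = torch.autograd.grad(du.sum(), x, create_graph=True)
    return -d2u - 1.0

for _ in range(2000):
    x_int = torch.rand(128, 1)                       # interior points sampled in (0, 1)
    x_bdy = torch.tensor([[0.0], [1.0]])             # boundary points
    # Discretized functional J: mean squared PDE residual plus boundary penalty.
    loss = (residual(net, x_int) ** 2).mean() + (net(x_bdy) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```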