“…Most existing approximation theories for deep neural networks so far focus on the approximation rate in the number of parameters W (Cybenko, 1989; Hornik, Stinchcombe, & White, 1989; Barron, 1993; Liang & Srikant, 2016; Yarotsky, 2017, 2018; Poggio, Mhaskar, Rosasco, Miranda, & Liao, 2017; Weinan & Wang, 2018; Petersen & Voigtlaender, 2018; Chui, Lin, & Zhou, 2018; Nakada & Imaizumi, 2019; Gribonval, Kutyniok, Nielsen, & Voigtlaender, 2019; Gühring, Kutyniok, & Petersen, 2019; Chen, Jiang, Liao, & Zhao, 2019; Li, Lin, & Shen, 2019; Suzuki, 2019; Bao et al., 2019; Opschoor, Schwab, & Zech, 2019; Yarotsky & Zhevnerchuk, 2019; Bölcskei, Grohs, Kutyniok, & Petersen, 2019; Montanelli & Du, 2019; Chen & Wu, 2019; Zhou, 2020; Montanelli & Yang, 2020; Montanelli, Yang, & Du, in press). From the point of view of theoretical difficulty, controlling two variables, N and L, in our theory is more challenging than controlling the single variable W as in the literature.…”