2021
DOI: 10.48550/arxiv.2102.10492
Preprint

Deep ReLU Networks Preserve Expected Length

Boris Hanin, Ryan Jeong, David Rolnick

Abstract: Assessing the complexity of functions computed by a neural network helps us understand how the network will learn and generalize. One natural measure of complexity is how the network distorts length: if the network takes a unit-length curve as input, what is the length of the resulting curve of outputs? It has been widely believed that this length grows exponentially in network depth. We prove that in fact this is not the case: the expected length distortion does not grow with depth, and indeed shrinks slightly.
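To make the quantity in the abstract concrete, here is a minimal Monte Carlo sketch (not the authors' code) that estimates the expected output-curve length for a He-initialized ReLU network applied to a unit-length input curve. The input dimension, width, depths, trial count, bias-free layers, and the choice of a straight segment as the curve are all illustrative assumptions.

```python
# Minimal sketch, assuming NumPy and a He-initialized, bias-free fully connected ReLU net.
import numpy as np

def random_relu_net(widths, rng):
    """Draw one He-initialized network: weight matrices for consecutive layer widths."""
    return [rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in))
            for n_in, n_out in zip(widths[:-1], widths[1:])]

def forward(layers, x):
    """Apply the network to a batch of points; ReLU on hidden layers, linear output."""
    for i, W in enumerate(layers):
        x = x @ W.T
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def output_curve_length(layers, curve_points):
    """Length of the image of the discretized curve, as a sum of segment lengths."""
    out = forward(layers, curve_points)
    return np.sum(np.linalg.norm(np.diff(out, axis=0), axis=1))

rng = np.random.default_rng(0)
d_in, width, n_trials = 16, 64, 200

# A straight unit-length segment in input space, finely discretized.
t = np.linspace(0.0, 1.0, 500)[:, None]
direction = rng.normal(size=(1, d_in))
direction /= np.linalg.norm(direction)
curve = t * direction  # total Euclidean length = 1

for depth in [1, 5, 10, 20]:
    widths = [d_in] + [width] * depth + [d_in]
    lengths = [output_curve_length(random_relu_net(widths, rng), curve)
               for _ in range(n_trials)]
    print(f"depth {depth:2d}: mean output length ~ {np.mean(lengths):.3f}")
```

If the paper's result holds in this setting, the printed means should stay roughly constant (and shrink slightly) as depth increases, rather than grow exponentially.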

Cited by 4 publications (11 citation statements)
References 7 publications
“…The conclusion in Theorem 2.2 is not new, having been obtained many times and under a variety of different assumptions (including for more general architectures) [31,49,56,68,82]. We refer the interested reader to [31] for a discussion of prior work and note only that convergence of the derivatives of the field z^{(L+1)}_α to its Gaussian limit does not seem to have been previously considered. We give a short proof that includes convergence of derivatives along the lines of the arguments in [31,49] in Appendix §A.…”
Section: Neural Network-centric Motivations (mentioning)
confidence: 91%
“…We refer the interested reader to [31] for a discussion of prior work and note only that convergence of the derivatives of the field z^{(L+1)}_α to its Gaussian limit does not seem to have been previously considered. We give a short proof that includes convergence of derivatives along the lines of the arguments in [31,49] in Appendix §A.…”
Section: Neural Network-centric Motivations (mentioning)
confidence: 99%
“…Based on these notions, Murray et al (2022) studies how to avoid rapid convergence of pairwise input correlations, vanishing and exploding gradients. However, Hanin et al (2021) proved that for a ReLU network with He initialization the length of the curve does not grow with the depth and even shrinks slightly. We establish similar results for maxout networks.…”
Section: Introduction (mentioning)
confidence: 99%
“…deep networks, how operations (linear transformations and non-linear activations) are connected and stacked together is vital, which is studied in network's convergence (Du et al, 2019;Zhou et al, 2020;Zou et al, 2020b), complexity (Poole et al, 2016;Rieck et al, 2018;Hanin et al, 2021), generalization (Chen et al, 2019b;Cao & Gu, 2019;Xiao et al, 2019), loss landscapes (Li et al, 2017;Fort & Jastrzebski, 2019;Shevchenko & Mondelli, 2020), etc.…”
Section: Introduction (mentioning)
confidence: 99%