2017
DOI: 10.48550/arxiv.1711.01530
Preprint

Fisher-Rao Metric, Geometry, and Complexity of Neural Networks

Cited by 31 publications (43 citation statements)
References 5 publications
“…This line of work resulted in improved generalization bounds for deep neural networks (e.g. Dziugaite and Roy, 2016; Kawaguchi et al, 2017; Bartlett et al, 2017; Neyshabur et al, 2018, 2017; Liang et al, 2017; Arora et al, 2018; Zhou et al, 2019). This work provides further empirical evidence and alludes to more fine-grained analysis.…”
Section: Introduction (mentioning)
confidence: 99%
“…which relates to the Fisher-Rao norm [LPRS17], a measure of model complexity. In addition, the weighted ℓ2 regularizer corresponds to an anisotropic Gaussian prior on the parameters, which enjoyed empirical success in neural networks [LW17, ZTSG19].…”
Section: Related Work (mentioning)
confidence: 99%
“…We acknowledge that besides the algorithm-dependent approach that we follow, recent advances in learning theory aim to explain the generalization performance of neural networks from many other perspectives. Some of the most prominent ideas include bounding the network capacity by the norm of weight matrices Neyshabur et al (2015); Liang et al (2017), margin theory Bartlett et al (2017); Wei et al (2018), PAC-Bayesian theory Dziugaite and Roy (2017); ; Dziugaite and Roy (2018), network compressibility Arora et al (2018), and over-parametrization Du et al (2018); Allen-Zhu et al (2018); ; Chizat and Bach (2018). Most of these results are stated in the context of neural networks (some are tailored to networks with specific architecture), whereas our work addresses generalization in non-convex stochastic optimization in general.…”
Section: Related Work (mentioning)
confidence: 99%