2019
DOI: 10.48550/arxiv.1904.12191
Preprint

Linearized two-layers neural networks in high dimension

Abstract: We consider the problem of learning an unknown function f on the d-dimensional sphere with respect to the square loss, given i.i.d. samples {(y_i, x_i)}_{i≤n} where x_i is a feature vector uniformly distributed on the sphere and y_i = f(x_i) + ε_i. We study two popular classes of models that can be regarded as linearizations of two-layers neural networks around a random initialization: the random features model of Rahimi-Recht (RF); the neural tangent kernel model of Jacot-Gabriel-Hongler (NT). Both these a…
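For concreteness, here is a minimal NumPy sketch of the two linearizations described in the abstract. It is an illustration under assumptions, not the authors' code: the ReLU activation, the ridge solver, and helper names such as rf_features and nt_features are choices made for this example. The RF model keeps the random first-layer weights w_i frozen and fits only second-layer coefficients on the features σ(⟨w_i, x⟩); the NT model fits a first-order perturbation of the first layer, giving features x·σ′(⟨w_i, x⟩).

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere_samples(n, d):
    """Draw n points uniformly from the unit sphere S^{d-1} in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def rf_features(X, W):
    """Random-features (RF) map: phi(x)_i = sigma(<w_i, x>) with ReLU sigma, first layer frozen."""
    return np.maximum(X @ W.T, 0.0)                       # shape (n, N)

def nt_features(X, W):
    """Neural-tangent (NT) map: phi(x)_i = x * sigma'(<w_i, x>), the gradient w.r.t. the first layer."""
    act = (X @ W.T > 0).astype(X.dtype)                   # sigma'(<w_i, x>) for ReLU
    return (act[:, :, None] * X[:, None, :]).reshape(X.shape[0], -1)   # shape (n, N*d)

def ridge_fit(Phi, y, lam=1e-3):
    """Ridge regression on the linearized features (both models are linear in their parameters)."""
    p = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)

# Toy usage in the setting of the abstract: noisy observations of a target on the sphere.
d, n, N = 20, 500, 100
W = sphere_samples(N, d)                                   # random first-layer weights (frozen)
X, X_test = sphere_samples(n, d), sphere_samples(n, d)
target = lambda Z: d * Z[:, 0] * Z[:, 1]                   # hypothetical degree-2 target, chosen for the example
y = target(X) + 0.1 * rng.standard_normal(n)

for name, feat in [("RF", rf_features), ("NT", nt_features)]:
    a = ridge_fit(feat(X, W), y)
    mse = np.mean((feat(X_test, W) @ a - target(X_test)) ** 2)
    print(f"{name} test MSE: {mse:.4f}")
```

Both estimators are linear in their trainable parameters, which is what makes them tractable linearizations of the nonlinear two-layers network.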

Cited by 57 publications (106 citation statements); references 12 publications.
“…On the one hand, Liang et al [27] prove asymptotic consistency for ground truth functions with asymptotically bounded Hilbert norms for neural tangent kernels (NTK) and inner product (IP) kernels. In contrast, Ghorbani et al [17,18] show that for uniform distributions on the product of two spheres, consistency cannot be achieved unless the ground truth is a low-degree polynomial. This polynomial approximation barrier can also be observed for random feature and neural tangent regression [17,31,32].…”
Section: Introduction
confidence: 97%
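The statement above refers to a polynomial approximation barrier. As a rough, self-contained illustration (a toy sketch under stated assumptions, not an experiment from the paper or the citing works), one can run kernel ridge regression on the sphere with the closed-form first-layer NT kernel for ReLU units, K(x, x′) = ⟨x, x′⟩ (π − arccos⟨x, x′⟩) / (2π), and compare a degree-1 target with a degree-3 target; with n well below d³, the degree-3 component is expected to remain essentially unlearnable, in line with the quoted barrier.

```python
import numpy as np

rng = np.random.default_rng(1)

def sphere_samples(n, d):
    """Draw n points uniformly from the unit sphere S^{d-1} in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def nt_kernel(A, B):
    """Closed-form first-layer NT kernel for ReLU units: <x, x'> * (pi - theta) / (2*pi)."""
    G = np.clip(A @ B.T, -1.0, 1.0)                        # cosines of pairwise angles
    return G * (np.pi - np.arccos(G)) / (2 * np.pi)

def relative_test_error(target, d, n, lam=1e-3):
    """Relative squared test error of kernel ridge regression for a given target on the sphere."""
    X, X_test = sphere_samples(n, d), sphere_samples(n, d)
    y = target(X) + 0.1 * rng.standard_normal(n)
    alpha = np.linalg.solve(nt_kernel(X, X) + lam * np.eye(n), y)
    pred = nt_kernel(X_test, X) @ alpha
    return np.mean((pred - target(X_test)) ** 2) / np.mean(target(X_test) ** 2)

d, n = 30, 1000                                             # n is far below d**3 here
deg1 = lambda Z: np.sqrt(d) * Z[:, 0]                       # degree-1 component, normalized to unit scale
deg3 = lambda Z: d ** 1.5 * Z[:, 0] * Z[:, 1] * Z[:, 2]     # degree-3 component, normalized to unit scale

print("relative error, degree-1 target:", relative_test_error(deg1, d, n))
print("relative error, degree-3 target:", relative_test_error(deg3, d, n))
```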
“…Due to the extreme non-linearity of the networks in both the generator and the discriminator, it is highly unlikely that the training objective of GANs can be convex-concave. In particular, even if the generator and the discriminator are linear functions over prescribed feature mappings-such as the neural tangent kernel (NTK) feature mappings [3,8,9,17,18,32,35,40,41,47,51,54,65,69,92,97] -the training objective can still be non-convex-concave. 1 Even worse, unlike supervised learning where some non-convex learning problems can be shown to have no bad local minima [44], to the best of our knowledge, it still remains unclear what the qualities are of those critical points in GANs except in the most simple setting when the generator is a one-layer neural network [42,62].…”
Section: Introduction
confidence: 99%