2022
DOI: 10.48550/arxiv.2202.13361
Preprint

Benign Underfitting of Stochastic Gradient Descent

Abstract: We study to what extent may stochastic gradient descent (SGD) be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to training data. We consider the fundamental stochastic convex optimization framework, where (one-pass, without-replacement) SGD is classically known to minimize the population risk at rate O(1/√n), and prove that, surprisingly, there exist problem instances where the SGD solution exhibits both empirical risk and generalization gap of…
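
The abstract refers to one-pass, without-replacement SGD with the classical step size of order 1/√n and iterate averaging. The following is a minimal sketch of that procedure on a simple stochastic convex problem (least-squares regression); the objective, data distribution, and step-size constant are illustrative assumptions, not the paper's lower-bound construction, and the test-set average is only a proxy for the population risk.

```python
# Minimal sketch (illustrative, not the paper's construction): one-pass,
# without-replacement SGD on a stochastic convex problem.
import numpy as np

rng = np.random.default_rng(0)

d, n = 20, 1000                              # dimension, number of samples
w_star = rng.normal(size=d) / np.sqrt(d)     # assumed ground-truth parameter

# Draw an i.i.d. sample S of (x_i, y_i) with y_i = <w*, x_i> + noise.
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)

def grad(w, x, yi):
    """Gradient of the squared loss 0.5 * (<w, x> - y)^2 at a single example."""
    return (x @ w - yi) * x

# One pass over the data, each example used exactly once (without replacement),
# with the classical step size eta ~ 1/sqrt(n) and averaging of the iterates.
eta = 1.0 / np.sqrt(n)
w = np.zeros(d)
iterates = []
for i in range(n):
    w = w - eta * grad(w, X[i], y[i])
    iterates.append(w.copy())
w_bar = np.mean(iterates, axis=0)            # averaged SGD output

train_risk = 0.5 * np.mean((X @ w_bar - y) ** 2)          # empirical risk on S
X_te = rng.normal(size=(10 * n, d))                        # fresh sample as a
y_te = X_te @ w_star + 0.1 * rng.normal(size=10 * n)       # population-risk proxy
test_risk = 0.5 * np.mean((X_te @ w_bar - y_te) ** 2)
print(f"empirical risk {train_risk:.4f} | population-risk proxy {test_risk:.4f}")
```

On benign instances such as this one the two risks are close; the paper's point is that there exist convex instances where the population risk stays O(1/√n) while the empirical risk remains bounded away from zero.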

Cited by 1 publication (2 citation statements)
References 10 publications
“…Together with Theorem 2.1 we obtain the following corollary. We stress again that, in contrast with these results, an optimization algorithm can correctly minimize the true loss using no more than Õ(1/ε²) iterations and Õ(1/ε²) examples [17].…”
Section: Results (supporting)
confidence: 51%
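
The Õ(1/ε²) iteration and sample counts quoted in this excerpt are the sample-complexity reading of the O(1/√n) rate stated in the abstract. A short standard conversion (not specific to this paper), written in LaTeX:

```latex
% Standard rate-to-sample-complexity conversion (schematic constant c).
% If the one-pass SGD output \bar{w}_S satisfies the classical guarantee
\[
  \mathbb{E}\, F(\bar{w}_S) - \min_{w} F(w) \;\le\; \frac{c}{\sqrt{n}},
\]
% then excess population risk at most \varepsilon requires
\[
  \frac{c}{\sqrt{n}} \;\le\; \varepsilon
  \quad\Longleftrightarrow\quad
  n \;\ge\; \frac{c^{2}}{\varepsilon^{2}},
\]
% i.e. \tilde{O}(1/\varepsilon^{2}) examples, and, since one-pass SGD uses each
% example once, the same \tilde{O}(1/\varepsilon^{2}) number of iterations.
```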
“…It is also worth mentioning that recently Koren et al. [17] showed that, in adaptive data analysis terminology, SGD is an example of a non post-hoc generalizing algorithm in the following sense: it can be shown that the output parameter w_S provided by SGD may minimize the population loss, but there is a constant gap between the empirical and population loss at w_S.…”
Section: Stochastic Convex Optimization (mentioning)
confidence: 99%
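
In symbols, with F the population loss, F̂_S the empirical loss on the sample S, and w_S the one-pass SGD output, the cited claim can be paraphrased as the following pair of guarantees; this is a restatement of the abstract's claim, not a quote from the paper, and the constants are schematic:

```latex
% Paraphrase of the cited claim (F = population loss, \hat{F}_S = empirical loss
% on the sample S, w_S = one-pass SGD output); constants are schematic.
\[
  \mathbb{E}\, F(w_S) - \min_{w} F(w) \;\le\; O\!\left(\tfrac{1}{\sqrt{n}}\right)
  \qquad \text{(the population loss is minimized),}
\]
\[
  \mathbb{E}\,\bigl|\hat{F}_S(w_S) - F(w_S)\bigr| \;=\; \Omega(1)
  \qquad \text{(constant gap between empirical and population loss).}
\]
```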