2019
DOI: 10.48550/arxiv.1909.00843
Preprint

Simple and optimal high-probability bounds for strongly-convex stochastic gradient descent

Cited by 6 publications (7 citation statements). References 0 publications.
“…Convergence results are derived for an empirical risk that is smooth and satisfies the PL inequality, and under a bounded variation property (related to the so-called volatility [5]). As explained in Section 1.2, prior works provided high-probability convergence bounds under the assumption of Lipschitz continuity and strong convexity of the loss function [6,7,8,9]. In this paper, we prove convergence bounds matching the optimal bounds of [8], but without strong convexity.…”
Section: Introduction (mentioning)
confidence: 76%
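
For reference, the Polyak–Łojasiewicz (PL) inequality invoked in this excerpt is the standard condition below; the exact form used by the citing paper is not reproduced on this page, so take this as the usual textbook statement with PL constant \mu > 0:

\[
\tfrac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x, \qquad f^* = \inf_x f(x).
\]

Every \mu-strongly convex function satisfies this with the same \mu, but not conversely, which is why a PL-based analysis can match the optimal strongly convex rates without assuming strong convexity.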
“…Regarding convergence of SGD, other papers prove high probability convergence bounds for projected gradient descent assuming Lipschitz continuity and strong convexity [6,7,8,9]. A function cannot be globally Lipschitz continuous and strongly convex (unless it is essentially a constant function), which is why they consider projected gradient.…”
Section: Prior Work and Contributions (mentioning)
confidence: 99%
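
The incompatibility noted in this excerpt follows from a standard one-line argument (not quoted from the cited papers): if f is \mu-strongly convex with minimizer x^* (so \nabla f(x^*) = 0) and also G-Lipschitz on all of \mathbb{R}^d, then

\[
\mu\,\|x - x^*\| \;\le\; \|\nabla f(x)\| \;\le\; G \quad \text{for all } x \in \mathbb{R}^d,
\]

which would confine every point to the ball \|x - x^*\| \le G/\mu, a contradiction on an unbounded domain. Restricting the iterates to a bounded convex set and projecting back onto it after each step removes the conflict, which is the role of projected (stochastic) gradient descent in those analyses.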
“…The lemma was proposed by Klein and Young [2015] (see Lemma 4) to show the tightness of the Chernoff bound. Recently it has been used to derive a high-probability lower bound for the 1/t step size [Harvey et al., 2019b]. We will now use Lemma 7.2 to prove the following high-probability lower bound for the step decay scheme in Algorithm 2 for S = T/log_α T.…”
Section: Proofs of Section (mentioning)
confidence: 99%
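
As a rough illustration only (Algorithm 2 itself is not reproduced on this page, so the exact stage length and rounding below are assumptions), a step decay step size with roughly log_α T stages of length S = T/log_α T could look like the following Python sketch:

import math

def step_decay_lr(t, eta0, alpha, T):
    # Step-decay step size: the horizon T is split into roughly
    # log_alpha(T) stages of length S = T / log_alpha(T), and the
    # step size is divided by alpha after each completed stage.
    # The exact rounding here is an assumption for illustration.
    num_stages = max(1, int(math.log(T, alpha)))  # about log_alpha(T) stages
    S = max(1, T // num_stages)                   # stage length S
    stage = t // S                                # stages completed before step t
    return eta0 / (alpha ** stage)

# Example: step sizes over T = 1000 iterations with decay factor alpha = 2.
lrs = [step_decay_lr(t, eta0=0.1, alpha=2.0, T=1000) for t in range(1000)]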
“…Kakade & Tewari (2009) used Freedman's inequality to prove high-probability bounds for an algorithm solving the SVM objective function. For classic SGD, Harvey et al. (2019b) and Harvey et al. (2019a) used a generalized Freedman's inequality to prove bounds in the non-smooth and strongly convex case, while Jain et al. (2019) proved the optimal bound for the last iterate of SGD with high probability. As far as we know, there are currently no high-probability bounds for adaptive methods in the nonconvex setting.…”
Section: Related Work (mentioning)
confidence: 99%
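
For context, Freedman's inequality is a martingale analogue of Bernstein's inequality. In one common form (constants and one-sided variants differ slightly across sources): if d_1, ..., d_T is a martingale difference sequence with |d_t| \le R almost surely and predictable variance V_T = \sum_{t=1}^{T} \mathbb{E}[d_t^2 \mid \mathcal{F}_{t-1}], then for all \lambda > 0 and \sigma^2 > 0,

\[
\Pr\Bigl(\sum_{t=1}^{T} d_t \ge \lambda \ \text{and}\ V_T \le \sigma^2\Bigr) \;\le\; \exp\Bigl(-\frac{\lambda^2}{2(\sigma^2 + R\lambda/3)}\Bigr).
\]

The "generalized" version referred to in the excerpt is a further extension of this bound and is not reproduced here.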