2014
DOI: 10.1137/140961791

A Proximal Stochastic Gradient Method with Progressive Variance Reduction

Abstract: We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component functions, and the other is a general convex function that admits a simple proximal mapping. We assume the whole objective function is strongly convex. Such problems often arise in machine learning, known as regularized empirical risk minimization. We propose and analyze a new proximal stochastic gradient method, which uses a multi-stage scheme to progressively reduce the variance of the stochastic gradient.
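The abstract describes the method only at a high level; the following is a minimal sketch of a multi-stage proximal stochastic gradient method with progressive variance reduction in that spirit, applied to a hypothetical L1-regularized least-squares problem. The objective, the step size eta, the stage length m, and the synthetic data are illustrative assumptions, not values taken from the paper.

```python
# Sketch of a Prox-SVRG-style method (multi-stage variance reduction + proximal step).
# All problem data and hyperparameters below are illustrative assumptions.
import numpy as np

def soft_threshold(z, tau):
    """Proximal mapping of tau * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_svrg(A, b, lam1, eta=0.01, num_stages=20, m=None, seed=0):
    """Minimize (1/2n)||Ax - b||^2 + lam1*||x||_1 with progressive variance reduction."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = m or 2 * n                              # inner-loop length per stage (assumption)
    x_tilde = np.zeros(d)                       # stage reference point
    for _ in range(num_stages):
        full_grad = A.T @ (A @ x_tilde - b) / n # full gradient at the reference point
        x = x_tilde.copy()
        running_sum = np.zeros(d)
        for _ in range(m):
            i = rng.integers(n)
            # variance-reduced stochastic gradient of the smooth part
            g_i = A[i] * (A[i] @ x - b[i])
            g_i_tilde = A[i] * (A[i] @ x_tilde - b[i])
            v = g_i - g_i_tilde + full_grad
            # proximal step on the nonsmooth part
            x = soft_threshold(x - eta * v, eta * lam1)
            running_sum += x
        x_tilde = running_sum / m               # average of inner iterates becomes the next reference
    return x_tilde

if __name__ == "__main__":
    # Tiny usage example on synthetic data (illustrative only).
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 50))
    x_true = np.zeros(50); x_true[:5] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(200)
    x_hat = prox_svrg(A, b, lam1=0.1)
    print("nonzeros recovered:", np.sum(np.abs(x_hat) > 1e-3))
```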

Cited by 524 publications (792 citation statements). References 15 publications.
“…Remark 1: Algorithm 2 and its analysis are slightly different from the delayed incremental gradient method that we presented in [9]. In the notation of this paper, our earlier algorithm would have a convergence factor of the same form as (8), but […] [10], it follows that our guaranteed upper bound improves upon the one in [9], especially for applications where the component functions vary substantially in smoothness.…”
Section: Efficiency Comparison with Asynchronous Incremental Gradient (mentioning)
confidence: 95%
“…7 in the base solution to the following Eq. 10 by appending a proximal operator [22] to obtain sparse solutions for the linear coefficients w_{j,l_j} of each relevant prop_j as …”
Section: Sparse Prediction Solution: SP-MIML (mentioning)
confidence: 99%
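For context, when the appended proximal operator is that of an L1 penalty (an assumption here; the excerpt only cites [22] without specifying the regularizer), the sparsity-inducing update reduces to componentwise soft-thresholding:

\[
\operatorname{prox}_{\eta\lambda\|\cdot\|_1}(z)_i \;=\; \operatorname{sign}(z_i)\,\max\bigl(|z_i| - \eta\lambda,\; 0\bigr),
\]

which sets small coefficients exactly to zero and thereby yields sparse linear coefficients.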
“…We first restate some useful results from [10]. Further, let F(x) and R(x) have convexity parameters μ_F and μ_R, respectively.…”
Section: Proof of Theorem (mentioning)
confidence: 99%
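For reference, "convexity parameter" here is in the standard sense used in [10]: a convex function G has convexity parameter μ_G ≥ 0 if, for all x, y and any (sub)gradient ξ ∈ ∂G(x),

\[
G(y) \;\ge\; G(x) + \langle \xi,\, y - x \rangle + \frac{\mu_G}{2}\,\|y - x\|_2^2 .
\]

Strong convexity of the composite objective F + R then holds whenever μ_F + μ_R > 0, which matches the assumption stated in the abstract above.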
“…We now proceed to prove Theorem 1. For brevity, we omit some details that are identical to those in the proof of Theorem 3.1 in [10]. We have indicated these omissions below.…”
Section: Proof of Theorem (mentioning)
confidence: 99%