2004
DOI: 10.1023/b:joth.0000013559.37579.b2
A Stochastic Approximation Algorithm with Step-Size Adaptation

Abstract: We consider the following stochastic approximation algorithm for finding the zero point x* of a function ϕ: x_{t+1} = x_t − γ_t y_t, y_t = ϕ(x_t) + ξ_t, where y_t are observations of ϕ and ξ_t is random noise. The step sizes γ_t of the algorithm are random, with the increment γ_{t+1} − γ_t depending on γ_t and on y_t y_{t−1} in a rather general form. Roughly, γ_t increases when y_t y_{t−1} > 0 and decreases otherwise. It is proved that the algorithm converges to x* almost surely. This result…
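The update and the sign-based adaptation rule described in the abstract can be sketched in code. This is a minimal illustration only: the multiplicative up/down factors, the noise model, and the test function ϕ(x) = x − 2 are assumptions made for this sketch, not the general adaptation rule analysed in the paper.

```python
import numpy as np

def adaptive_stochastic_approximation(phi, x0, gamma0=0.1, noise_std=0.5,
                                      up=1.1, down=0.5, n_steps=2000, seed=0):
    """Sketch of stochastic approximation with sign-based step-size adaptation:
    the step gamma_t grows when consecutive observations satisfy
    y_t * y_{t-1} > 0 and shrinks otherwise.  The multiplicative rule
    (up/down factors) is an illustrative choice, not the rule from the paper."""
    rng = np.random.default_rng(seed)
    x, gamma = x0, gamma0
    y_prev = None
    for _ in range(n_steps):
        y = phi(x) + rng.normal(0.0, noise_std)   # noisy observation y_t = phi(x_t) + xi_t
        x = x - gamma * y                         # x_{t+1} = x_t - gamma_t * y_t
        if y_prev is not None:
            # grow the step while observations keep the same sign,
            # shrink it once they start oscillating around the root
            gamma = gamma * up if y * y_prev > 0 else gamma * down
        y_prev = y
    return x

# usage: search for the zero of phi(x) = x - 2 (root at x* = 2)
if __name__ == "__main__":
    x_star = adaptive_stochastic_approximation(lambda x: x - 2.0, x0=10.0)
    print(x_star)  # should end up close to 2
```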

Cited by 18 publications (10 citation statements). References: 7 publications.
“…The parameter β > 0 is fixed at the beginning of each run, as discussed below, and the SQN method is implemented as described in Algorithm 1. It is well known amongst the optimization and machine learning communities that the SGD method can be improved by choosing the parameter β via a set of problem dependent heuristics [19,27]. In some cases, β k (rather than β) is made to vary during the course of the iteration, and could even be chosen so that β k /k is constant, in which case only convergence to a neighborhood of the solution is guaranteed [15].…”
Section: Numerical Experiments
confidence: 99%
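The distinction drawn in this excerpt, between the standard diminishing step β/k and a step chosen so that it stays constant over the iterations (in which case only convergence to a neighbourhood of the solution is guaranteed), can be illustrated with a toy run. The quadratic test problem, noise level, and parameter values below are assumptions for illustration, not taken from the cited SQN paper.

```python
import numpy as np

def sgd_on_noisy_quadratic(beta, n_iters=5000, constant_step=False, seed=0):
    """Illustrative SGD run on f(x) = 0.5 * x**2 with additive gradient noise.
    With step alpha_k = beta / k the iterates approach the minimiser x* = 0;
    with a constant step alpha_k = beta they only reach a noise-dominated
    neighbourhood of x*.  The test problem and noise level are assumptions
    made for this sketch."""
    rng = np.random.default_rng(seed)
    x = 5.0
    for k in range(1, n_iters + 1):
        g = x + rng.normal(0.0, 1.0)              # noisy gradient of 0.5 * x**2
        alpha = beta if constant_step else beta / k
        x -= alpha * g
    return x

print(sgd_on_noisy_quadratic(beta=1.0))                      # close to 0
print(sgd_on_noisy_quadratic(beta=0.1, constant_step=True))  # hovers near 0 within a noise band
```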
“…Our multi-level step-size adaptation idea is inspired by Plakhov and Cruz [28] and Klein et al. [29]: if two consecutive gradients ∇F_{t−1} and ∇F_t are in the same direction, i.e.…”
Section: Adaptive SGD on the Grassmannian
confidence: 99%
“…The most important parameter of GASG21 is μ_max, which controls how fast GASG21 goes to the next level. GASG21 does not generate the actual step-size η_j by the adaptive step-size framework [28, 29]; rather, the adaptive step-size framework is used to generate the important sequence μ_j. According to Klein et al.…”
Section: Adaptive SGD on the Grassmannian
confidence: 99%
“…If the gradients point in opposite directions, the step size is reduced. The theoretical convergence properties of the method in one-dimensional (P = 1) optimisation problems were studied by Plakhov and Cruz (2004). Cruz (2005a) extended the analysis to multidimensional (P > 1) problems.…”
Section: (μ) ≡ C(f_M • T)
confidence: 99%
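For the multidimensional case (P > 1) mentioned in the last excerpt, a natural analogue of the one-dimensional sign test y_t y_{t−1} > 0 is the inner product of consecutive noisy gradients. The sketch below rests on that assumption; the multiplicative factors and the quadratic test problem are illustrative choices, not the scheme analysed by Cruz (2005a).

```python
import numpy as np

def adaptive_sgd_nd(grad, x0, gamma0=0.1, up=1.05, down=0.5,
                    noise_std=0.1, n_steps=3000, seed=0):
    """Multidimensional analogue of the sign-based rule: grow the step when
    consecutive noisy gradients point in roughly the same direction
    (positive inner product) and shrink it when they oppose each other.
    The multiplicative factors and the quadratic test problem below are
    assumptions for illustration only."""
    rng = np.random.default_rng(seed)
    x, gamma = np.asarray(x0, dtype=float), gamma0
    g_prev = None
    for _ in range(n_steps):
        g = grad(x) + rng.normal(0.0, noise_std, size=x.shape)  # noisy gradient
        x = x - gamma * g
        if g_prev is not None:
            gamma = gamma * up if np.dot(g, g_prev) > 0 else gamma * down
        g_prev = g
    return x

# usage: minimise f(x) = 0.5 * ||x - c||^2, whose exact gradient is x - c
c = np.array([1.0, -2.0, 3.0])
print(adaptive_sgd_nd(lambda x: x - c, x0=np.zeros(3)))  # approaches c
```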