2019
DOI: 10.48550/arxiv.1903.07228
Preprint

Optimal Rate of Convergence for Quasi-Stochastic Approximation

Abstract: The Robbins-Monro stochastic approximation algorithm is a foundation of many algorithmic frameworks for reinforcement learning (RL), and often an efficient approach to solving (or approximating the solution to) complex optimal control problems. However, in many cases practitioners are unable to apply these techniques because of an inherent high variance. This paper aims to provide a general foundation for "quasi-stochastic approximation," in which all of the processes under consideration are deterministic, much…
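The distinction the abstract draws can be illustrated with a toy scalar root-finding problem (an illustrative sketch, not the paper's algorithm or analysis): the same Robbins-Monro iteration is driven once by i.i.d. noise and once by a deterministic zero-mean probing signal. All names and the toy mean field f̄(θ) = −(θ − 1) are assumptions made for this example.

```python
import math
import random

def run(noise, theta0=5.0, n=20000, target=1.0):
    """Robbins-Monro iteration theta_{k+1} = theta_k + a_k * f(theta_k, xi_k)
    for the toy mean field f_bar(theta) = -(theta - target)."""
    theta = theta0
    for k in range(1, n + 1):
        a_k = 1.0 / k                # standard diminishing step size
        xi = noise(k)                # exploration/noise signal
        theta += a_k * (-(theta - target) + xi)
    return theta

random.seed(0)
# Classical stochastic approximation: i.i.d. zero-mean noise.
theta_sa = run(lambda k: random.gauss(0.0, 1.0))
# Quasi-stochastic approximation: deterministic zero-mean probing signal.
theta_qsa = run(lambda k: math.sin(0.1 * k))

print(abs(theta_sa - 1.0), abs(theta_qsa - 1.0))
```

With a_k = 1/k the residual error is exactly the running average of the driving signal, so the bounded partial sums of the sinusoid give an O(1/n) error, while the i.i.d. noise averages out only at the O(1/√n) Monte Carlo rate — the variance reduction the abstract refers to.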

Cited by 3 publications (20 citation statements)
References 23 publications
“…While the stylized algorithm (4) and approximation (6) are used here for an illustration of the main ideas, the paper considers a more general convex constrained optimization framework, and develops model-free primal-dual methods to track the optimal trajectories. Using a deterministic exploration approach reminiscent of the quasi-stochastic approximation method [5], this paper provides design principles for the exploration signal ξ, as well as other algorithmic parameters, to ensure stability and tracking guarantees. In particular, we show that under some conditions, the iterates x(k) converge within a ball around the optimal solution of (2).…”
Section: arXiv:1909.13132v1 [math.OC] 28 Sep 2019
confidence: 99%
“…The paper then develops distributed algorithms based on the zero-order approximation of the method of multipliers. In contrast to our paper, [21] considers a stochastic exploration signal for the gradient estimation, and typically requires N > 2 function evaluations to reduce the estimation variance [21, Lemma 1]; see also [5] for the detailed analysis of the advantage of deterministic vs stochastic exploration. Moreover, it considers a static optimization problem.…”
Section: Literature Review
confidence: 99%
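The contrast this excerpt draws — deterministic versus stochastic exploration in zero-order gradient estimation — can be sketched on a toy scalar objective (an illustrative example under assumed names, not the construction of [21] or [5]). Here a single two-point evaluation per step suffices because the probing direction is a deterministic sinusoid rather than a random draw:

```python
import math

def f(x):
    # Toy objective with minimizer x* = 2 (chosen for illustration).
    return (x - 2.0) ** 2

def two_point_grad(x, u, delta=1e-2):
    # Two-point zero-order estimate of f'(x) along probing direction u:
    # only function values of f are used, never its derivative.
    return (f(x + delta * u) - f(x - delta * u)) / (2.0 * delta) * u

x = 0.0
for k in range(1, 5001):
    u = math.cos(0.1 * k)              # deterministic zero-mean probing signal
    x -= (1.0 / k) * two_point_grad(x, u)

print(f"final iterate: {x:.4f}")       # approaches the minimizer 2.0
```

Because cos²(0.1k) averages to 1/2 along the deterministic probe, the estimator's effective gain is predictable and no averaging over N > 2 random evaluations is needed to tame the variance — the advantage the excerpt attributes to deterministic exploration.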