2021
DOI: 10.48550/arxiv.2107.12416
Preprint

Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

Abstract: Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale networks. In this paper, we propose a novel distributed zeroth-order algorithm by leveraging the network structure inh…
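The estimator the abstract critiques can be made concrete. Below is a minimal sketch (not the paper's algorithm) of the standard two-point zeroth-order gradient estimator, in which the random perturbation has the same dimension as the global decision variable — the source of the variance issue on large networks that the paper's block-coordinate variant is designed to avoid by perturbing only local blocks. The function names and the toy quadratic cost are illustrative assumptions.

```python
import numpy as np

def two_point_zo_gradient(f, x, mu=1e-3, rng=None):
    """Standard two-point zeroth-order gradient estimate of f at x.

    Note the perturbation u has the SAME dimension as x (the global
    variable), which is the variance issue the abstract points out
    for large-scale networks.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape)  # full-dimensional random direction
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

# Illustrative use on a toy quadratic cost (a hypothetical stand-in
# for an LQR cost evaluated by rollout):
if __name__ == "__main__":
    d = 10
    A = np.diag(np.arange(1.0, d + 1))
    f = lambda x: 0.5 * x @ A @ x
    x = np.ones(d)
    g_hat = two_point_zo_gradient(f, x)
    print(np.linalg.norm(g_hat - A @ x))  # deviation from the true gradient
```

The variance of such full-dimensional estimates grows roughly with the dimension d, which is why restricting each agent's perturbation to its own coordinate block, as the paper's title indicates, is attractive in large networks.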

Cited by 1 publication (1 citation statement) · References 38 publications (71 reference statements)
“…where the stochastic component F(x; ξ), indexed by the random variable ξ, is possibly nonconvex and nonsmooth. We focus on tackling the problem with a Lipschitz continuous objective, which arises in many popular applications including simulation optimization [17,34], deep neural networks [4,15,33,48], statistical learning [11,31,49,50,52], reinforcement learning [5,21,30,41], financial risk minimization [40] and supply chain management [10]. The Clarke subdifferential [6] for Lipschitz continuous functions is a natural extension of the gradient for smooth functions and the subdifferential for convex functions.…”
Section: Introduction (mentioning)
Confidence: 99%
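For context on the citing work's setting, the Clarke subdifferential mentioned in the statement above has a standard characterization, reproduced here for convenience (by Rademacher's theorem, a locally Lipschitz f is differentiable almost everywhere):

```latex
% Clarke subdifferential of a locally Lipschitz f: R^n -> R.
% D_f denotes the (full-measure) set of points where \nabla f exists.
\[
  \partial_C f(x) \;=\;
  \operatorname{conv}\Bigl\{\, \lim_{k \to \infty} \nabla f(x_k)
    \;:\; x_k \to x,\; x_k \in D_f \,\Bigr\}.
\]
% For smooth f this reduces to \{\nabla f(x)\}; for convex f it
% coincides with the convex subdifferential, matching the statement.
```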