2021
DOI: 10.1109/lcsys.2020.3006256

On the Linear Convergence of Random Search for Discrete-Time LQR

Cited by 41 publications (50 citation statements)
References 7 publications

“…We apply a two-point gradient estimation for ∇J(K), as it yields better sample complexity than the one-point setting. Motivated by Mohammadi et al. (2020), we show that, with high probability, (3) converges linearly using only log(1/ε) simulation time and total sampled trajectories for a desired accuracy ε of the cost, which significantly improves on the polynomial sample complexity in Perdomo et al. (2021). By letting ε = (J − J*)γ^{i+1}, we have the following result.…”
Section: Results (mentioning)
confidence: 80%
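
The two-point scheme quoted above is simple to sketch. Below is a minimal Python illustration, not the cited authors' code: lqr_cost is a hypothetical rollout helper for J(K) under u_t = -K x_t, and two_point_gradient forms the standard two-point zeroth-order estimate (d / 2r)[J(K + rU) - J(K - rU)]U along a random direction U on the unit sphere; the horizon, smoothing radius, and all names here are assumptions for illustration.

import numpy as np

def lqr_cost(K, A, B, Q, R, x0, horizon=100):
    # Hypothetical rollout estimate of the LQR cost J(K) under u_t = -K x_t;
    # the finite horizon and initial state are illustrative assumptions.
    x, cost = x0.astype(float), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

def two_point_gradient(K, cost_fn, r=1e-2, rng=None):
    # Two-point zeroth-order gradient estimate: probe the cost at K + rU and
    # K - rU along one random direction U on the unit sphere, then scale the
    # finite difference by d / (2r), where d is the number of entries of K.
    rng = np.random.default_rng() if rng is None else rng
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)          # uniform direction on the unit sphere
    d = K.size
    return (d / (2.0 * r)) * (cost_fn(K + r * U) - cost_fn(K - r * U)) * U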
“…with ∇J_γ(K_k) estimated by Algorithm 2. The result (Mohammadi et al., 2020, Theorem 1) can be restated as follows. Lemma 11 significantly improves on the existing literature in both the simulation time and the number of sampled trajectories.…”
Section: Discussion (mentioning)
confidence: 96%
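
The iteration that this quote refers to is plain gradient descent on the cost with the estimated gradient. A hedged sketch under the same assumptions as the snippet above, with step size, iteration count, and the commented toy system chosen purely for illustration rather than taken from Mohammadi et al. (2020):

def random_search_lqr(K0, cost_fn, step=1e-3, iters=200, r=1e-2):
    # Model-free iteration K_{k+1} = K_k - step * grad_hat J(K_k), with
    # grad_hat produced by the two-point estimator sketched earlier.
    K = K0.copy()
    for _ in range(iters):
        K = K - step * two_point_gradient(K, cost_fn, r=r)
    return K

# Usage on an assumed toy double-integrator system:
# A, B = np.array([[1.0, 0.1], [0.0, 1.0]]), np.array([[0.0], [0.1]])
# Q, R = np.eye(2), np.eye(1)
# J = lambda K: lqr_cost(K, A, B, Q, R, x0=np.ones(2))
# K_hat = random_search_lqr(np.zeros((1, 2)), J)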
“…Prior work: The majority of data-driven control approaches have focused on providing solutions to the linear quadratic regulator (LQR) problem. Among these works, we focus on direct methods that do not require a system identification step (Hjalmarsson et al., 1998; Fazel et al., 2018; Mohammadi et al., 2020; Bradtke et al., 1994; De Persis and Tesi, 2019; Trentelman et al., 2020). Specifically, we highlight the work in De Persis and Tesi (2019), which applies behavioral systems theory to parametrize systems from past trajectories.…”
Section: Introduction (mentioning)
confidence: 99%
“…This inspires increasing interest in understanding the performance of policy-based RL algorithms on simplified linear control benchmarks. For standard linear quadratic control problems, policy-based RL methods have been shown to yield strong convergence guarantees in various settings [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21]. For robust/risk-sensitive control problems, the robust adversarial reinforcement learning (RARL) framework appears to be quite relevant.…”
Section: Introduction (mentioning)
confidence: 99%