2020
DOI: 10.48550/arxiv.2003.13839
Preprint

Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties

Qingrui Zhang,
Wei Pan,
Vasso Reppa

Abstract: This paper presents a novel model-reference reinforcement learning control method for uncertain autonomous surface vehicles. The proposed control combines a conventional control method with deep reinforcement learning. With the conventional control, we can ensure the learning-based control law provides closed-loop stability for the overall system, and potentially increase the sample efficiency of the deep reinforcement learning. With the reinforcement learning, we can directly learn a control law to compensate…

Cited by 6 publications (14 citation statements)
References 30 publications (62 reference statements)
“…Note that both $\theta$ and $\phi$ are a set of parameters whose dimensions are determined by the deep neural network setup. Details on the design of $Q_\theta(s_t, u_{l,t})$ and $\pi_\phi(u_{l,t} \mid s_t)$ can be found in [27]. The deep neural network for $Q_\theta$ is called critic, while the one for $\pi_\phi$ is called actor [27].…”
Section: Deep Reinforcement Learning Control Design (mentioning)
confidence: 99%
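
The statement above says that the sizes of the critic and actor parameter sets follow from the network setup. A minimal sketch of that point in Python with NumPy is given here. Everything in it is an assumption made for illustration and is not taken from the cited papers: the state and action dimensions, the single hidden layer, the helper names (`init_mlp`, `mlp`, `n_params`, `critic`, `actor_sample`), and the choice of a Gaussian actor that outputs a mean and a log standard deviation.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Return a list of (W, b) pairs for a fully connected network."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def n_params(params):
    """Total number of scalar parameters in the network."""
    return sum(W.size + b.size for W, b in params)

# Hypothetical sizes, chosen only for this example.
STATE_DIM, ACTION_DIM, HIDDEN = 6, 2, 32

rng = np.random.default_rng(0)
# Critic parameters (theta): input is the state-action pair, output is a scalar.
theta = init_mlp([STATE_DIM + ACTION_DIM, HIDDEN, 1], rng)
# Actor parameters (phi): input is the state, output is mean and log-std (assumed Gaussian).
phi = init_mlp([STATE_DIM, HIDDEN, 2 * ACTION_DIM], rng)

def critic(theta, s, u):
    """Evaluate Q_theta(s, u) as a scalar."""
    return mlp(theta, np.concatenate([s, u]))[0]

def actor_sample(phi, s, rng):
    """Draw one action u ~ pi_phi(. | s) from the assumed Gaussian policy."""
    out = mlp(phi, s)
    mean, log_std = out[:ACTION_DIM], out[ACTION_DIM:]
    return mean + np.exp(log_std) * rng.standard_normal(ACTION_DIM)

s = np.zeros(STATE_DIM)
u = actor_sample(phi, s, rng)
q = critic(theta, s, u)
print(n_params(theta), n_params(phi))
```

Changing `HIDDEN` or adding layers changes the two parameter counts, which is all the quoted sentence claims; the actual architecture is the one described in [27].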
“…Lemma 2. (Policy improvement) Considering the last updated policy $\pi_{\text{old}}$ and the new policy $\pi_{\text{new}}$ to be obtained from (27), $L_{\pi_{\text{new}}}(k) \le L_{\pi_{\text{old}}}(k)$ holds for $\forall x_k \in S$ and $\forall a_k \in A$.…”
Section: Algorithm Convergence Analysis (mentioning)
confidence: 99%
“…Only until recently, the asymptotic stability in model-free RL is given for robotic control tasks [26]. In [27], [28], the stability of a system with a combination of a classic baseline controller and a RL controller is proved for autonomous surface vehicles with collisions.…”
Section: Introduction (mentioning)
confidence: 99%
“…Some of the work in this paper has been accepted to be presented in the 59th IEEE Conference on Decision and Control (CDC) that will be hosted at December, 2020. The online version of our CDC paper can be found in [48]. In our CDC paper, the collision avoidance problem is not addressed.…”
Section: Introduction (mentioning)
confidence: 99%