Model-Reference Reinforcement Learning for Collision-Free Tracking Control of Autonomous Surface Vehicles

Zhang, Qingrui; Pan, Wei; Reppa, Vasso

doi:10.48550/arxiv.2008.07240

Cited by 4 publications (5 citation statements)

References 56 publications

“…Motivated by the works in [20], [25], [26], [27], we propose to incorporate the theoretical result in Theorem 1 to formulate a constrained optimisation problem, based on SAC [17]. First of all, a Lyapunov candidate needs to be selected at the first instance.…”

Section: Lyapunov-based Reinforcement Learning (mentioning)

confidence: 99%

Reinforcement Learning for Orientation Estimation Using Inertial Sensors with Performance Guarantee

Hu, Tang, Zhou et al. 2021

Preprint

Self Cite

This paper presents a deep reinforcement learning (DRL) algorithm for orientation estimation using inertial sensors combined with a magnetometer. Lyapunov's method in control theory is employed to prove the convergence of orientation estimation errors. Based on the theoretical results, the estimator gains and a Lyapunov function are parametrised by deep neural networks and learned from samples. The DRL estimator is compared with three well-known orientation estimation methods on both numerical simulations and a real dataset collected from commercially available sensors. The results show that the proposed algorithm is superior for arbitrary estimation initialisation and can adapt to very large angular velocities for which other algorithms are hardly applicable. To the best of our knowledge, this is the first DRL-based orientation estimation method with an estimation error boundedness guarantee.

“…The convergence guarantee can be validated by checking the values of Lagrange multipliers. When the Lyapunov constraint in (28) is satisfied, the parameter λ should continuously decrease to zero. In Fig.…”

Section: A Algorithm Convergence (mentioning)

confidence: 99%
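
The behaviour described in the statement above (a multiplier λ that shrinks to zero once the constraint holds) can be illustrated with a generic projected dual-ascent update. This is only a minimal sketch under stated assumptions: the citing paper's actual update rule and its constraint (28) are not reproduced on this page, so `update_multiplier`, `constraint_value`, and `step_size` are hypothetical names, and the convention that a non-positive `constraint_value` means "constraint satisfied" is an assumption.

```python
def update_multiplier(lam, constraint_value, step_size=0.1):
    """One projected dual-ascent step on a Lagrange multiplier.

    Assumed convention: constraint_value <= 0 means the constraint is
    satisfied, constraint_value > 0 means it is violated.
    The max(0, ...) projection keeps the multiplier non-negative.
    """
    return max(0.0, lam + step_size * constraint_value)


# Hypothetical run: the constraint is satisfied at every step,
# so the multiplier decreases monotonically and is clipped at zero.
lam = 1.0
history = []
for _ in range(100):
    lam = update_multiplier(lam, constraint_value=-0.2)
    history.append(lam)
```

In this sketch a persistently satisfied constraint drives the multiplier down until the projection holds it at zero, whereas a violated constraint would push it back up.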

“…Only until recently, the asymptotic stability in model-free RL is given for robotic control tasks [26]. In [27], [28], the stability of a system with a combination of a classic baseline controller and a RL controller is proved for autonomous surface vehicles with collisions.…”

Section: Introduction (mentioning)

confidence: 99%

Lyapunov-Based Reinforcement Learning State Estimator

Hu, Wu, Pan 2020

Preprint

Self Cite

In this paper, we consider the state estimation problem for nonlinear stochastic discrete-time systems. We combine Lyapunov's method in control theory and deep reinforcement learning to design the state estimator. We theoretically prove the convergence of the bounded estimate error solely using the data simulated from the model. An actor-critic reinforcement learning algorithm is proposed to learn the state estimator approximated by a deep neural network. The convergence of the algorithm is analysed. The proposed Lyapunov-based reinforcement learning state estimator is compared with a number of existing nonlinear filtering methods through Monte Carlo simulations, showing its advantage in terms of estimate convergence even under some system uncertainties such as covariance shift in system noise and randomly missing measurements. To the best of our knowledge, this is the first reinforcement learning based nonlinear state estimator with bounded estimate error performance guarantee.

“…But it is used to guarantee the safety of the agent in the training instead of the stability. To guarantee stability, Zhang et al in Zhang, Dong and Pan (2020), Zhang, Pan and Reppa (2020) respectively proposed basic control based SAC algorithms for ships and multi-agents. Furthermore, Lyapunov-based soft actor-critic algorithms have been proposed for traditional control and estimation design in our previous work, in which the stability has been proved by solely using data (Han, Zhang, Wang, & Pan, 2020;Hu, Wu, & Pan, 2020) and a learned Lyapunov function constraint.…”

Section: Introduction (mentioning)

confidence: 99%