2021
DOI: 10.1109/tnnls.2021.3084685
Safe Reinforcement Learning With Stability Guarantee for Motion Planning of Autonomous Vehicles

Cited by 72 publications (22 citation statements)
References 28 publications
“…The off-line gathered data from LiDAR and IMU sensors are required to update the deep neural networks. A similar collision-prediction method is leveraged for safe navigation in [23]. However, collecting supervised data is arduous, and it is challenging to generalize the prediction model to completely different scenarios because the off-line data are generally gathered from specific environments.…”

Section: Related Work
Mentioning confidence: 99%
“…In the robotics domain, multiple approaches exist for increasing the reliability of such systems [51], [66], [75]; however, these methods are mostly heuristic in nature [1], [20], [43]. To date, existing techniques for improving the safety of robotic systems rely mostly on Lagrangian multipliers [39], [53], [57], and do not provide formal safety guarantees, but rather optimize the training in an attempt to learn the required policies [12].…”

Section: Related Work
Mentioning confidence: 99%
“…Only limited research has been conducted on risk-aware AV controllers. A recent study (L. Zhang et al., 2021) used the soft actor-critic (SAC) algorithm and control-theory-based Lyapunov functions to train an agent subject to safety constraints. The authors in (Wen, Duan, Li, Xu, & Peng, 2020) proposed Parallel Constrained Policy Optimization (PCPO), which consists of three neural networks to estimate the policy function, value function, and a risk function.…”

Section: Risk-aware Autonomous Systems
Mentioning confidence: 99%
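The Lyapunov-based constraint mentioned in the excerpt above can be sketched in miniature. This is not the cited paper's algorithm, only the generic acceptance test it builds on: a transition is considered stability-preserving when a Lyapunov candidate V decreases along it by at least a margin alpha; the function and candidate below are illustrative assumptions.

```python
# Hypothetical sketch of a Lyapunov decrease check for safe RL:
# a candidate function V must not increase along accepted transitions.

def lyapunov_safe(V, state, next_state, alpha=0.0):
    """True iff V decreases by at least `alpha` on this transition."""
    return V(next_state) <= V(state) - alpha


# Toy 1-D example: V is the squared distance to the goal at 0.
# Moving from 2.0 toward the goal (1.0) satisfies the decrease
# condition; moving away from it does not.
V = lambda s: s * s
ok = lyapunov_safe(V, 2.0, 1.0)
bad = lyapunov_safe(V, 1.0, 2.0)
```

In constrained training schemes of this kind, the decrease condition is typically enforced as a constraint on the policy update rather than checked transition by transition at deployment time.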