Safe Reinforcement Learning for Autonomous Lane Changing Using Set-Based Prediction

Krasowski, Hanna; Wang, Xiao; Althoff, Matthias

doi:10.1109/itsc45102.2020.9294259

Cited by 44 publications

(18 citation statements)

References 20 publications


“…One can also use prior knowledge as an inductive bias for the exploration process [4], [15]; for example, one can provide a finite set of demonstrations as guidance on the task [16]. Although these approaches can provide strong safety guarantees, most of them assume prior knowledge on some or all components of the system model [17], [18], [19], which is not always feasible for more complicated systems. In addition, some techniques in this category also suffer the curse of dimensionality [18], [20].…”

Section: A Related Work (mentioning)

confidence: 99%

Safe Reinforcement Learning Using Black-Box Reachability Analysis

Selim, Alanwar, Kousik et al. 2022. Preprint.


Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments. However, state-of-the-art deep RL approaches typically lack safety guarantees, especially when the robot and environment models are unknown. To justify widespread deployment, robots must respect safety constraints without sacrificing performance. Thus, we propose a Black-box Reachability-based Safety Layer (BRSL) with three main components: (1) data-driven reachability analysis for a black-box robot model, (2) a trajectory rollout planner that predicts future actions and observations using an ensemble of neural networks trained online, and (3) a differentiable polytope collision check between the reachable set and obstacles that enables correcting unsafe actions. In simulation, BRSL outperforms other state-of-the-art safe RL methods on a Turtlebot 3, a quadrotor, and a trajectory-tracking point mass with an unsafe set adjacent to the area of highest reward.


“…For competitive scenarios like autonomous lane change or lane merge, both model-based methods [1] and learning-based methods [2] have been demonstrated to generate the ego vehicle's desired trajectory. Similarly, control using model-based methods [3], [4] and learning-based methods [5] have also been developed. However, the criteria to evaluate planning and control performance are different for car racing compared to autonomous driving on public roads.…”

Section: B Related Work (mentioning)

confidence: 99%

Autonomous Racing with Multiple Vehicles using a Parallelized Optimization with Safety Guarantee using Control Barrier Functions

He, Zeng, Sreenath 2021. Preprint.


This paper presents a novel planning and control strategy for competing with multiple vehicles in a car racing scenario. The proposed racing strategy switches between two modes. When there are no surrounding vehicles, a learning-based model predictive control (MPC) trajectory planner is used to guarantee that the ego vehicle achieves better lap timing. When the ego vehicle is competing with other surrounding vehicles to overtake, an optimization-based planner generates multiple dynamically-feasible trajectories through parallel computation. Each trajectory is optimized under an MPC formulation with different homotopic Bezier-curve reference paths lying laterally between surrounding vehicles. The time-optimal trajectory among these different homotopic trajectories is selected and a low-level MPC controller with obstacle avoidance constraints is used to guarantee system safety-critical performance. The proposed algorithm has the capability to generate collision-free trajectories and track them while enhancing the lap timing performance with steady low computational complexity, outperforming existing approaches in both timing and performance for a car racing environment. To demonstrate the performance of our racing strategy, we simulate with multiple randomly generated moving vehicles on the track and test the ego vehicle's overtake maneuvers.


“…In addition to constrained MDP-based approaches, there exist works that improve safety by generating more samples in the risky region to bootstrap performance in critical scenarios [18]; by using a safety layer at the end of a deep neural network to verify the safety of the resulting policy and replacing with a backup safe action if needed [19]; by proposing a reachability-based trajectory safe guard to ensure the safety of a policy [20], etc. In this paper, we focus on generating risk-bounded policies directly by modeling the risk as an explicit constraint in the objective function.…”

Section: B Safe Reinforcement Learning (mentioning)

confidence: 99%

Risk Conditioned Neural Motion Planning

Huang, Feng, Jasour et al. 2021. Preprint.


Risk-bounded motion planning is an important yet difficult problem for safety-critical tasks. While existing mathematical programming methods offer theoretical guarantees in the context of constrained Markov decision processes, they either lack scalability in solving larger problems or produce conservative plans. Recent advances in deep reinforcement learning improve scalability by learning policy networks as function approximators. In this paper, we propose an extension of soft actor critic model to estimate the execution risk of a plan through a risk critic and produce risk-bounded policies efficiently by adding an extra risk term in the loss function of the policy network. We define the execution risk in an accurate form, as opposed to approximating it through a summation of immediate risks at each time step that leads to conservative plans. Our proposed model is conditioned on a continuous spectrum of risk bounds, allowing the user to adjust the risk-averse level of the agent on the fly. Through a set of experiments, we show the advantage of our model in terms of both computational time and plan quality, compared to a state-of-the-art mathematical programming baseline, and validate its performance in more complicated scenarios, including nonlinear dynamics and larger state space.
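The contrast the abstract draws between an accurate execution risk and a summation of immediate risks can be illustrated with a small sketch. It assumes that per-step failure events are independent and that "execution risk" means the probability of at least one failure over the plan; both are assumptions made here for illustration, not definitions taken from the paper. The function names and the sample risk values are hypothetical.

```python
from math import prod


def execution_risk(step_risks):
    """Probability of at least one failure over the plan.

    Assumes (for illustration only) that per-step failures are independent.
    """
    return 1.0 - prod(1.0 - r for r in step_risks)


def summed_risk(step_risks):
    """Union-bound style approximation: sum of immediate risks, capped at 1."""
    return min(1.0, sum(step_risks))


# Hypothetical per-step risks for a five-step plan.
step_risks = [0.1] * 5

print(f"execution risk: {execution_risk(step_risks):.5f}")
print(f"summed risk:    {summed_risk(step_risks):.5f}")
```

Under these assumptions the summed value is never smaller than the product-form value, which is one way to see why a planner constrained by the sum can end up more conservative than necessary.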
