Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning

Brunke, Lukas; Greeff, Melissa; Hall, Adam W.; Yuan, Zhaocong; Zhou, Siqi; Panerati, Jacopo; Schoellig, Angela P.

doi:10.48550/arxiv.2108.06266

Cited by 10 publications

(14 citation statements)

References 96 publications

(179 reference statements)

“…Embedding such structure would facilitate faster training and potentially better generalization. Another exciting future direction is to augment our approach with methods from safe reinforcement learning [56] to provide generalization guarantees on the success of safety-critical systems while ensuring safety during the training process itself (in contrast to only providing guarantees for deployment, as we do here). Finally, we are working towards hardware experiments of our approach on a drone navigating an obstacle field under wind disturbances.…”

Section: Discussion (mentioning)

confidence: 99%

Learning Provably Robust Motion Planners Using Funnel Libraries

Gurgen, Majumdar, Veer

2021

Preprint

This paper presents an approach for learning motion planners that are accompanied with probabilistic guarantees of success on new environments that hold uniformly for any disturbance to the robot's dynamics within an admissible set. We achieve this by bringing together tools from generalization theory and robust control. First, we curate a library of motion primitives where the robustness of each primitive is characterized by an over-approximation of the forward reachable set, i.e., a "funnel". Then, we optimize probably approximately correct (PAC)-Bayes generalization bounds for training our planner to compose these primitives such that the entire funnels respect the problem specification. We demonstrate the ability of our approach to provide strong guarantees on two simulated examples: (i) navigation of an autonomous vehicle under external disturbances on a five-lane highway with multiple vehicles, and (ii) navigation of a drone across an obstacle field in the presence of wind disturbances.
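As a rough illustration of the PAC-Bayes step mentioned in the abstract, the sketch below evaluates one standard form of the bound (McAllester's bound with Maurer's constant) for costs in [0, 1]. This is an assumption about the general shape of such a bound, not necessarily the exact variant the authors optimize; the function name `pac_bayes_bound` and its parameters are hypothetical.

```python
import math


def pac_bayes_bound(empirical_cost: float, kl: float, n: int, delta: float) -> float:
    """Upper bound on expected cost, holding with probability >= 1 - delta.

    Assumes costs lie in [0, 1] and the n training environments are i.i.d.
    empirical_cost: average cost of the posterior on the training environments
    kl:             KL divergence between posterior and prior over planners
    """
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must be in (0, 1)")
    if n <= 0 or kl < 0.0:
        raise ValueError("n must be positive and kl non-negative")
    # Complexity term: sqrt((KL + log(2 * sqrt(n) / delta)) / (2 * n))
    slack = math.sqrt((kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n))
    # A cost in [0, 1] can never exceed 1, so the bound is clipped there.
    return min(1.0, empirical_cost + slack)


if __name__ == "__main__":
    # Hypothetical numbers: 10% empirical failure rate on 1000 environments.
    print(pac_bayes_bound(empirical_cost=0.1, kl=0.0, n=1000, delta=0.01))
```

Under this form, training trades off the empirical cost against the KL term: a posterior that drifts far from the prior fits the training environments better but pays for it with a looser guarantee.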

“…Safety. We refer to [81] for a detailed review of learning-based control. Essentially, safety constraints can be embedded in Problem 2 in three ways, which in turn correspond to different safety levels [81].…”

Section: Comments (mentioning)

confidence: 99%

Probabilistic design of optimal sequential decision-making algorithms in learning and control

Emiland, Russo

2022

Preprint

This survey is focused on certain sequential decision-making problems that involve optimizing over probability functions. We discuss the relevance of these problems for learning and control. The survey is organized around a framework that combines a problem formulation and a set of resolution methods. The formulation consists of an infinite-dimensional optimization problem. The methods come from approaches to search optimal solutions in the space of probability functions. Through the lenses of this overarching framework we revisit popular learning and control algorithms, showing that these naturally arise from suitable variations on the formulation mixed with different resolution methods. A running example, for which we make the code available, complements the survey. Finally, a number of challenges arising from the survey are also outlined.

“…Safe RL. Our work is broadly related to the safe reinforcement learning and control literature; we refer interested readers to (García and Fernández 2015; Brunke et al 2021) for surveys on this topic. A popular class of approaches incorporates Lagrangian constraint regularization into the policy updates in policy-gradient algorithms (Achiam et al 2017; Ray, Achiam, and Amodei 2019; Tessler, Mankowitz, and Mannor 2018; Dalal et al 2018; Cheng et al 2019; Zhang, Vuong, and Ross 2020; Chow et al 2019).…”

Section: Related Work (mentioning)

confidence: 99%

Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Ma, Shen, Bastani

et al. 2021

Preprint

Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the reward and the cost objectives. First, CAP inflates predicted costs using an uncertainty-based penalty. Theoretically, we show that policies that satisfy this conservative cost constraint are guaranteed to also be feasible in the true environment. We further show that this guarantees the safety of all intermediate solutions during RL training. Further, CAP adaptively tunes this penalty during training using true cost feedback from the environment. We evaluate this conservative and adaptive penalty-based approach for model-based safe RL extensively on state and image-based environments. Our results demonstrate substantial gains in sample-efficiency while incurring fewer violations than prior safe RL algorithms. Code is available at: https://github.com/Redrew/CAP
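The two mechanisms named in the abstract, inflating predicted costs by an uncertainty-based penalty and tuning that penalty from true cost feedback, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `conservative_cost`, `update_kappa`, `kappa`, and `step_size` are hypothetical, and the simple proportional update stands in for whatever tuning rule the paper actually uses.

```python
def conservative_cost(predicted_cost: float, uncertainty: float, kappa: float) -> float:
    """Inflate the model-predicted cost by a penalty scaled by model uncertainty.

    kappa >= 0 is a hypothetical penalty weight; larger values are more cautious.
    """
    return predicted_cost + kappa * uncertainty


def update_kappa(kappa: float, true_cost: float, cost_limit: float,
                 step_size: float = 0.1) -> float:
    """Adapt the penalty weight from cost observed in the real environment.

    Assumed rule: raise kappa when the true cost exceeds the limit, lower it
    when there is slack, and never let it go negative.
    """
    return max(0.0, kappa + step_size * (true_cost - cost_limit))


if __name__ == "__main__":
    kappa = 1.0
    # The planner would compare this inflated cost against the cost limit.
    print(conservative_cost(predicted_cost=1.0, uncertainty=0.5, kappa=kappa))
    # The true episode cost exceeded the limit, so the penalty weight grows.
    kappa = update_kappa(kappa, true_cost=12.0, cost_limit=10.0)
    print(kappa)
```

The intent of such a scheme is that a large penalty keeps early policies cautious while the model is poor, and the feedback loop relaxes it once observed costs sit below the limit.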
