Edoardo Bacci scite author profile

Edoardo Bacci

5 Publications

30 Citation Statements Received

61 Citation Statements Given


Affiliations

University of Birmingham

Publications


Probabilistic Guarantees for Safe Deep Reinforcement Learning

Bacci

Parker

2020


Deep reinforcement learning is an increasingly popular technique for synthesising policies to control an agent's interaction with its environment. There is also growing interest in formally verifying that such policies are correct and execute safely. Progress has been made in this area by building on existing work for verification of deep neural networks and of continuous-state dynamical systems. In this paper, we tackle the problem of verifying probabilistic policies for deep reinforcement learning, which are used to, for example, tackle adversarial environments, break symmetries and manage trade-offs. We propose an abstraction approach, based on interval Markov decision processes, that yields probabilistic guarantees on a policy's execution, and present techniques to build and solve these models using abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking. We implement our approach and illustrate its effectiveness on a selection of reinforcement learning benchmarks.
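As a rough illustration of how an interval Markov decision process can yield a probabilistic guarantee, the sketch below computes an upper bound on the probability of reaching an unsafe state within a fixed number of steps. It is not the authors' implementation: the toy model, the state names and the functions `max_expectation` and `max_unsafe_probability` are all hypothetical, and the abstraction, MILP and refinement steps mentioned in the abstract are left out.

```python
def max_expectation(intervals, values):
    """Maximise sum(p[s] * values[s]) over distributions p with
    lo <= p[s] <= hi for each successor s and sum(p) == 1."""
    # Start from the lower bounds, then hand the remaining probability
    # mass to the successors with the highest value first.
    probs = {s: lo for s, (lo, hi) in intervals.items()}
    budget = 1.0 - sum(probs.values())
    for s in sorted(intervals, key=lambda t: values[t], reverse=True):
        lo, hi = intervals[s]
        extra = min(hi - lo, budget)
        probs[s] += extra
        budget -= extra
    return sum(probs[s] * values[s] for s in intervals)


def max_unsafe_probability(imdp, unsafe, steps):
    """Upper bound, per state, on the probability of reaching an
    unsafe state within `steps` transitions."""
    values = {s: 1.0 if s in unsafe else 0.0 for s in imdp}
    for _ in range(steps):
        updated = {}
        for state, actions in imdp.items():
            if state in unsafe:
                updated[state] = 1.0  # unsafe states are absorbing
            else:
                # Worst case over both the action choice and the
                # transition probabilities allowed by the intervals.
                updated[state] = max(
                    max_expectation(iv, values) for iv in actions.values()
                )
        values = updated
    return values


# Hypothetical toy model: state -> action -> successor -> (lo, hi).
toy_imdp = {
    "s0": {"a": {"s1": (0.6, 0.9), "bad": (0.1, 0.4)}},
    "s1": {"a": {"s1": (0.8, 1.0), "bad": (0.0, 0.2)}},
    "bad": {},
}
bounds = max_unsafe_probability(toy_imdp, unsafe={"bad"}, steps=2)
```

Here `bounds["s0"]` is the worst-case probability of reaching `bad` from `s0` within two steps under every transition distribution consistent with the intervals.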


Verifying Reinforcement Learning up to Infinity

Bacci

Giacobbe

Parker

2021


Formally verifying that reinforcement learning systems act safely is increasingly important, but existing methods only verify over finite time. This is of limited use for dynamical systems that run indefinitely. We introduce the first method for verifying the time-unbounded safety of neural networks controlling dynamical systems. We develop a novel abstract interpretation method which, by constructing adaptable template-based polyhedra using MILP and interval arithmetic, yields sound (safe and invariant) overapproximations of the reach set. This provides stronger safety guarantees than previous time-bounded methods and shows whether the agent has generalised beyond the length of its training episodes. Our method supports ReLU activation functions and systems with linear, piecewise linear and non-linear dynamics defined with polynomial and transcendental functions. We demonstrate its efficacy on a range of benchmark control problems.
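The interval arithmetic ingredient mentioned in this abstract can be illustrated by propagating an input box through a small ReLU network. The sketch below is a simplification and not the paper's method: it uses plain boxes rather than template polyhedra, involves no MILP, and the network weights and the function names `affine_bounds`, `relu_bounds` and `network_bounds` are hypothetical.

```python
def affine_bounds(weights, bias, lower, upper):
    """Sound interval bounds for W @ x + b when lower <= x <= upper."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps the lower input bound to the lower
            # output bound; a negative weight swaps the two.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def network_bounds(layers, lower, upper):
    """Propagate a box through affine layers with ReLU between them."""
    for index, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(weights, bias, lower, upper)
        if index < len(layers) - 1:  # no activation after the last layer
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


# Hypothetical 2-2-1 network and input box [0, 1] x [0, 1].
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.0]),
]
out_lo, out_hi = network_bounds(toy_layers, [0.0, 0.0], [1.0, 1.0])
```

Every output the network can produce on the input box lies between `out_lo` and `out_hi`. The bounds are sound but can be loose, which is one reason more expressive abstract domains are used in practice.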


Probabilistic Guarantees for Safe Deep Reinforcement Learning

Bacci

Parker

2020

Preprint


Deep reinforcement learning has been successfully applied to many control tasks, but the application of such agents in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning agents in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on agents trained for several benchmark control problems.


Verified Probabilistic Policies for Deep Reinforcement Learning

Bacci

Parker

2022


Verified Probabilistic Policies for Deep Reinforcement Learning

Bacci

Parker

2022

Preprint


