Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence 2021
DOI: 10.24963/ijcai.2021/297

Verifying Reinforcement Learning up to Infinity

Abstract: Formally verifying that reinforcement learning systems act safely is increasingly important, but existing methods only verify over finite time. This is of limited use for dynamical systems that run indefinitely. We introduce the first method for verifying the time-unbounded safety of neural networks controlling dynamical systems. We develop a novel abstract interpretation method which, by constructing adaptable template-based polyhedra using MILP and interval arithmetic, yields sound---safe and invaria…
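The abstract mentions interval arithmetic as one ingredient of the abstraction. A minimal sketch of that ingredient (not the paper's implementation; the weights and layer shape below are illustrative assumptions) is sound box propagation through one affine + ReLU layer:

```python
# Sketch: propagate an input box [lo, hi] through y = ReLU(W @ x + b),
# producing output bounds that are guaranteed to contain every reachable value.
# W and b are made-up example values, not taken from the paper.

def affine_interval(lo, hi, W, b):
    """Sound elementwise bounds for W @ x + b when lo[j] <= x[j] <= hi[j]."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        lo_acc, hi_acc = bias, bias
        for w, l, h in zip(row, lo, hi):
            if w >= 0:          # positive weight: lower bound uses l, upper uses h
                lo_acc += w * l
                hi_acc += w * h
            else:               # negative weight: bounds swap
                lo_acc += w * h
                hi_acc += w * l
        out_lo.append(lo_acc)
        out_hi.append(hi_acc)
    return out_lo, out_hi

def relu_interval(lo, hi):
    """ReLU is monotone, so it applies directly to each bound."""
    return [max(0.0, l) for l in lo], [max(0.0, h) for h in hi]

W = [[1.0, -2.0], [0.5, 1.0]]
b = [0.0, -1.0]
lo, hi = affine_interval([0.0, 0.0], [1.0, 1.0], W, b)
lo, hi = relu_interval(lo, hi)   # lo = [0.0, 0.0], hi = [1.0, 0.5]
```

Template polyhedra generalize this idea from axis-aligned boxes to bounds along a chosen set of directions, with MILP used where intervals are too coarse.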

Cited by 14 publications (10 citation statements)
References 0 publications
“…(ii) Adaptive cruise control [2]: The problem has two vehicles i ∈ {lead, ego}, whose state is determined by variables x_i and v_i for the position and speed of each car, respectively. The lead car proceeds at constant speed (28 m s⁻¹), and the agent controls the acceleration (±1 m s⁻²) of ego using two actions.…”
Section: Methods
mentioning confidence: 99%
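The adaptive cruise control benchmark quoted above can be sketched as discrete-time dynamics. The time step and the Euler update below are assumptions for illustration, not taken from the cited paper:

```python
# Sketch of the two-vehicle ACC dynamics described in the citation:
# lead car at constant 28 m/s, ego acceleration chosen from {-1, +1} m/s^2.
# DT and the Euler integration scheme are assumed, not from the paper.

DT = 0.1  # integration step in seconds (assumption)

def step(state, action):
    """One Euler step. state = (x_lead, v_lead, x_ego, v_ego); action in {-1.0, +1.0}."""
    x_lead, v_lead, x_ego, v_ego = state
    x_lead += v_lead * DT      # lead proceeds at constant speed
    x_ego += v_ego * DT
    v_ego += action * DT       # agent controls ego acceleration (+-1 m/s^2)
    return (x_lead, v_lead, x_ego, v_ego)

s = step((50.0, 28.0, 0.0, 25.0), +1.0)
# s = (52.8, 28.0, 2.5, 25.1)
```

A safety property here would be an invariant such as x_lead - x_ego staying above a minimum gap for all time, which is exactly the kind of time-unbounded claim the paper verifies.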
“…Formal verification of RL, but in a non-probabilistic setting, includes: [5], which extracts and analyses decision trees; [27], which checks safety and liveness properties for deep RL; and [2], which also uses template polyhedra and MILP to build abstractions, but to check (non-probabilistic) safety invariants.…”
Section: Related Work
mentioning confidence: 99%
“…To date, the DNN verification community has focused primarily on feed-forward DNNs [21], [27], [30], [41], [68]. Some work has been carried out on verifying DRL networks, which pose greater challenges: beyond the general scalability challenges of DNN verification, in DRL verification we must also take into account that agents typically interact with a reactive environment [5], [9], [13], [17]. In particular, these agents are invoked multiple times, and the inputs of each invocation are usually affected by the outputs of the previous invocations.…”
Section: Introduction
mentioning confidence: 99%
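The closed-loop structure this citation describes — each controller invocation's input is the state produced by the previous step — can be sketched minimally. The controller and dynamics below are toy placeholders, not the cited work's models:

```python
# Sketch of closed-loop agent-environment interaction: verifying the network in
# isolation is not enough, because states are fed back through the environment.
# controller() and plant() are made-up stand-ins for a policy and dynamics.

def controller(x):
    """Toy stand-in for a trained policy: decelerate once position exceeds 1."""
    return -0.5 if x > 1.0 else 0.5

def plant(x, u):
    """Toy one-dimensional dynamics: next state = state + action."""
    return x + u

def rollout(x0, horizon):
    """Each invocation's input is the output of the previous step."""
    xs = [x0]
    for _ in range(horizon):
        xs.append(plant(xs[-1], controller(xs[-1])))
    return xs

traj = rollout(0.0, 4)
# traj = [0.0, 0.5, 1.0, 1.5, 1.0]
```

Time-unbounded verification, as in the paper this page indexes, must reason about all such feedback trajectories at once rather than unrolling them to a fixed horizon.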