Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence 2019
DOI: 10.24963/ijcai.2019/768

Counterexample-Guided Strategy Improvement for POMDPs Using Recurrent Neural Networks

Abstract: We study strategy synthesis for partially observable Markov decision processes (POMDPs). The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints. This problem is computationally intractable and theoretically hard. We propose a novel method that combines techniques from machine learning and formal verification. First, we train a recurrent neural network (RNN) to encode POMDP strategies. The RNN accounts for memory-based decisions without the need to e…

Cited by 24 publications (22 citation statements)
References 3 publications
“…Closest to the proposed method is [Carr et al, 2019], which introduced a verification-guided method to train RNNs as POMDP policies. In contrast to the proposed method, while policies are extracted from the RNNs, these policies do not directly exhibit the memory structure of the RNNs and are instead handcrafted based on knowledge about the particular application.…”
Section: Related Work (mentioning)
confidence: 99%
“…For that, we use diagnostic information in the form of counterexamples to generate new data [Carr et al, 2019].…”
Section: Outline (mentioning)
confidence: 99%