2019 IEEE 58th Conference on Decision and Control (CDC)
DOI: 10.1109/cdc40024.2019.9028919
Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

Abstract: Reinforcement Learning (RL) has emerged as an efficient method of choice for solving complex sequential decision making problems in automatic control, computer science, economics, and biology. In this paper we present a model-free RL algorithm to synthesize control policies that maximize the probability of satisfying high-level control objectives given as Linear Temporal Logic (LTL) formulas. Uncertainty is considered in the workspace properties, the structure of the workspace, and the agent actions, giving ri…

Cited by 90 publications (80 citation statements). References 32 publications.
“…Any LTL formula ϕ can be converted into various ω-automata, namely finite state machines that recognize all infinite words satisfying ϕ. We review a generalized Büchi automaton at the beginning, and then introduce a limit-deterministic generalized Büchi automaton [10].…”
Section: B. Linear Temporal Logic and Automata
confidence: 99%
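The quoted passage concerns generalized Büchi automata, which accept an infinite word when every one of their accepting state sets is visited infinitely often. As a minimal illustrative sketch (not the paper's LDGBA construction), acceptance can be checked for an ultimately periodic "lasso" word prefix·cycle^ω over a deterministic automaton given as a transition dictionary:

```python
def accepts_lasso(delta, init, acc_sets, prefix, cycle):
    """Decide whether a deterministic generalized Buchi automaton
    accepts the ultimately periodic word prefix . cycle^omega.

    delta    : dict mapping (state, letter) -> next state
    acc_sets : list of accepting state sets; each must be visited
               infinitely often (generalized Buchi condition)
    """
    s = init
    for letter in prefix:
        s = delta[(s, letter)]
    # Pump the cycle until the state at a cycle boundary repeats;
    # from then on the run is periodic.
    starts = []
    while s not in starts:
        starts.append(s)
        for letter in cycle:
            s = delta[(s, letter)]
    # One full period starting from the repeating state collects
    # exactly the states that are visited infinitely often.
    visited, t = set(), s
    while True:
        for letter in cycle:
            t = delta[(t, letter)]
            visited.add(t)
        if t == s:
            break
    return all(visited & F for F in acc_sets)
```

For example, an automaton whose state records the last letter read, with accepting sets {'A'} and {'B'}, accepts (ab)^ω but rejects a^ω — i.e., it encodes "infinitely often a and infinitely often b".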
“…Through the above scenario, we compare our approach with 1) a case where we first convert the tLDGBA into a tLDBA, for which the augmentation makes no change, and thus a reward function in Definition 10 is based on a single accepting set; and 2) the method using a reward function based on the accepting frontier function [9], [10]. For the three methods, we use Q-learning with an epsilon-greedy policy.…”
Section: Example
confidence: 99%
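The comparison above relies on tabular Q-learning with an epsilon-greedy policy. As a hedged sketch of that basic learning loop (the chain MDP, state and action counts, and hyperparameters here are illustrative stand-ins, not the product MDP or rewards from the paper):

```python
import random

def q_learning(n_states=4, n_actions=2, episodes=2000,
               alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy on a toy
    chain MDP: action 0 moves left, action 1 moves right, and
    reaching the last state yields reward 1 and ends the episode."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda act: Q[s][act])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard Q-learning temporal-difference update
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

After training, the greedy policy derived from Q prefers "right" in every non-terminal state, which is optimal for this toy chain; the papers compared in the quote differ only in how the reward r is derived from the automaton's accepting sets, not in this update rule.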