2022
DOI: 10.1007/978-3-031-22337-2_29

A Framework for Transforming Specifications in Reinforcement Learning

Cited by 9 publications (7 citation statements)
References 24 publications

“…Hence, if only the number of state/action pairs is allowed, alongside 1/ε and 1/δ, as parameters, creating a PAC learning algorithm for undiscounted, infinite-horizon properties is not possible. Specifically for LTL, this has been observed by Yang, Littman, and Carbin (2021) and Alur et al (2022). Example 1 (Intractability of LTL).…”
Section: Introduction
confidence: 74%
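
For context, the PAC guarantee that this statement refers to has, in its usual form, the following shape. This is a standard textbook-style formulation given only as a sketch, with assumed notation (J_M for the value of a policy on MDP M, N for the number of samples); the citing paper's exact definition may differ.

```latex
% Standard PAC-style guarantee for RL (assumed textbook formulation; the
% citing paper's exact definition may differ). With probability at least
% 1 - delta the returned policy is eps-optimal, and the sample count N is
% polynomial only in the allowed parameters.
\[
  \Pr\!\left[\, J_M(\hat{\pi}) \;\ge\; J_M(\pi^{*}) - \varepsilon \,\right] \;\ge\; 1 - \delta,
  \qquad
  N \;\le\; \operatorname{poly}\!\left(|S|,\, |A|,\, \tfrac{1}{\varepsilon},\, \tfrac{1}{\delta}\right).
\]
```

The observation quoted above is that, for undiscounted infinite-horizon objectives such as general LTL properties, no algorithm can meet this guarantee when the polynomial may depend only on those four quantities.
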
“…Example 1 (Intractability of LTL). Figure 1 is an example adopted from (Alur et al 2022) that shows the number of samples required to learn safety properties is dependent on some property of the transition structure. The objective in this example is to stay in the initial state s0 forever.…”
Section: Introduction
confidence: 99%
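
The exact MDP of Figure 1 is not reproduced on this page; the snippet below is a hypothetical two-state construction in the same spirit, offered as a minimal Python sketch. From s0, a "safe" action always stays in s0, while a "risky" action moves to an absorbing bad state with a small, unknown probability p, so under the objective "stay in s0 forever" the number of samples needed to tell the two actions apart grows like 1/p, a quantity determined by the transition structure rather than by |S|, |A|, 1/ε, or 1/δ.

```python
import random

# Hypothetical two-state MDP in the spirit of the intractability example
# quoted above (the actual Figure 1 of Alur et al. 2022 is not reproduced
# on this page). The safety objective is "stay in s0 forever".
#
# From s0, action "safe" always remains in s0; action "risky" moves to the
# absorbing bad state s1 with a small probability p. Until a failure is
# actually observed, the two actions look identical, and on average about
# 1/p samples of the risky action are needed to observe one.

def step(state, action, p, rng):
    """One transition of the hypothetical MDP."""
    if state == "s1":                        # s1 is absorbing: safety already violated
        return "s1"
    if action == "safe":
        return "s0"
    return "s1" if rng.random() < p else "s0"   # "risky" leaves s0 with probability p

def samples_until_first_failure(p, seed=0):
    """Number of 'risky' transitions drawn before the first visit to s1."""
    rng = random.Random(seed)
    state, n = "s0", 0
    while state == "s0":
        state = step(state, "risky", p, rng)
        n += 1
    return n

if __name__ == "__main__":
    for p in (1e-1, 1e-3, 1e-5):
        print(f"p = {p:g}: first failure of the risky action after "
              f"{samples_until_first_failure(p)} samples")
```
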
“…An additional approach to ensure safety in RL is through shielding, which intervenes in the agent's actions when it might violate safety constraints (Alshiekh et al 2018). Integrating formal methods, like temporal logic and Lyapunov-based techniques, into RL algorithms has emerged as a promising direction for safe RL (Hasanbeig, Abate, and Kroening 2018;Alur et al 2023;Chow et al 2018). STL Mining.…”
Section: Related Work
confidence: 99%
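
As an illustration of the shielding idea mentioned in this statement, the minimal Python sketch below wraps a policy with a safety monitor that replaces unsafe proposed actions with allowed ones. The interface (is_safe, actions, filter) is an illustrative assumption, not the construction of Alshiekh et al. (2018).

```python
from typing import Callable, Sequence

# Minimal sketch of shielding: intercept the agent's proposed action and
# substitute a safe fallback whenever the proposal would violate a safety
# check. The names and interface here are illustrative assumptions, not the
# shield construction of Alshiekh et al. (2018).

class Shield:
    def __init__(self,
                 is_safe: Callable[[object, object], bool],
                 actions: Sequence[object]):
        self.is_safe = is_safe      # safety monitor: (state, action) -> bool
        self.actions = actions      # full action set, searched for a fallback

    def filter(self, state, proposed_action):
        """Return the proposed action if safe, otherwise a safe alternative."""
        if self.is_safe(state, proposed_action):
            return proposed_action
        for a in self.actions:      # fall back to any action the monitor allows
            if self.is_safe(state, a):
                return a
        raise RuntimeError("no safe action available in this state")

# Usage sketch (hypothetical names): wrap the agent's policy during training
# and deployment.
#   shield = Shield(is_safe=my_monitor, actions=env_actions)
#   action = shield.filter(state, agent.act(state))
```
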
“…However, these analyses give reinforcement-learning algorithms for particular objectives and do not generalize to other objectives. Previous work (Alur et al 2021) gave a framework of reductions between objectives whose flavor of generality is most similar to our work; however, they did not give a condition for when an objective is PAC-learnable. To our knowledge, the PAC-learnability of the objectives in Sadigh et al (2014); Littman et al (2017); ; ; Camacho et al (2019); Jothimurugan, Alur, and Bastani (2019); are not known.…”
Section: Introduction
confidence: 99%