Deep Reinforcement Learning with Temporal Logics
2020 · DOI: 10.1007/978-3-030-57628-8_1

Cited by 36 publications (34 citation statements) · References 30 publications
“…Figure 11(c) shows the average and the maximum number of steps required to terminate for all the engines with every specification across 100 executions in logarithmic scale. The number of steps is a known measure used to compare RL methods logically constrained with LTL formulae [40‐43]. Known RL‐LTL methods take a high number of steps, in the order of hundreds of thousands, because these methods aim to converge to an optimal policy.…”
Section: Discussion (confidence: 99%)
“…Several studies [40‐43] use LTL specifications as a high‐level guide for an RL agent. The RL agent in these studies never terminates and has to avoid violating a given specification indefinitely.…”
Section: Related Work (confidence: 99%)
“…Our model extends Araki et al (2019b), which builds upon the Value Iteration Network (VIN) model (Tamar et al 2016) by applying a more structured variant of VIN to the product of a low-level MDP with a logical specification defined by an FSA. Other works incorporating logical structure into the imitation learning setting include Paxton et al (2017), Li, Ma, and Belta (2017), Hasanbeig, Abate, and Kroening (2018), Icarte et al (2018), Burke, Penkov, and Ramamoorthy (2019), and Gordon, Fox, and Farhadi (2019). These models assume that at least part of the logic specification is known, and they are neither interpretable nor manipulable.…”
Section: Related Work (confidence: 99%)
“…[32] uses LTL to define constraints on a Monte Carlo Tree Search. [28] and [18] use the product of an LTL-derived FSA with an MDP to make learning more efficient.…”
Section: Related Work (confidence: 99%)
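The product construction mentioned in the quote above (composing an LTL-derived FSA with an MDP so the agent learns over joint states) can be sketched as follows. This is a minimal toy illustration, not code from any of the cited works: the 1-D grid MDP, the labelling function, and the specification "F goal" (eventually reach the goal) are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical 1-D grid MDP: states 0..4, actions -1 / +1.
GRID_SIZE = 5

def mdp_step(s: int, a: int) -> int:
    """Deterministic grid transition, clipped to the state space."""
    return max(0, min(GRID_SIZE - 1, s + a))

def label(s: int) -> str:
    """Labelling function mapping MDP states to atomic propositions."""
    return "goal" if s == GRID_SIZE - 1 else ""

# FSA for "F goal": state 0 = not yet satisfied, state 1 = accepting sink.
FSA_DELTA: Dict[Tuple[int, str], int] = {
    (0, ""): 0,
    (0, "goal"): 1,
    (1, ""): 1,
    (1, "goal"): 1,
}

def product_step(state: Tuple[int, int], a: int) -> Tuple[Tuple[int, int], float]:
    """One step in the product MDP; reward 1 on first entering the accepting state."""
    s, q = state
    s2 = mdp_step(s, a)                  # move in the low-level MDP
    q2 = FSA_DELTA[(q, label(s2))]       # advance the FSA on the new label
    reward = 1.0 if (q2 == 1 and q != 1) else 0.0
    return (s2, q2), reward

if __name__ == "__main__":
    state = (0, 0)
    total = 0.0
    for _ in range(GRID_SIZE - 1):       # walk right toward the goal
        state, r = product_step(state, +1)
        total += r
    print(state, total)                  # reaches the accepting product state
```

Any off-the-shelf RL algorithm can then be run over the joint `(mdp_state, fsa_state)` space, which is the sense in which the product makes learning against an LTL specification tractable.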
“…In [21] the authors use LTL to design a sub-task extraction procedure as part of a more standard deep reinforcement learning setup. However, these methods assume the LTL specifications are already known, and [32,33,21,18] do not allow for a model that is easy to interpret and manipulate. By contrast, our model only requires the current FSA state and the location of logic propositions in the environment.…”
Section: Related Work (confidence: 99%)