Interpretable policies for reinforcement learning by genetic programming

Hein, Daniel; Udluft, Steffen; Runkler, Thomas A.

doi:10.1016/j.engappai.2018.09.007

Cited by 102 publications

(67 citation statements)

References 30 publications

(36 reference statements)

“…Similarly, Hein et al [85] introduced another algebraic based method for RL interpretation but with using genetic algorithms. The authors collected data of four dimensions that include: current state, action, next state and the received reward, and then trained their algorithm for two Atari games:…”

Section: Algebraic Languages (mentioning)

confidence: 99%

Reinforcement Learning Interpretation Methods: A Survey

2020

Reinforcement Learning (RL) systems achieved outstanding performance in different domains such as Atari games, finance and self-driving cars. However, their black-box nature complicates their use, especially in critical applications such as healthcare. To solve this problem, researchers have proposed different approaches to interpret RL models. Some of these methods were adopted from machine learning, while others were designed specifically for RL. The main objective of this paper is to show and explain RL interpretation methods, the metrics used to classify them, and how these metrics were applied to understand the internal details of RL models. We reviewed papers that propose new RL interpretation methods, improve the old ones, or discuss the pros and cons of the existing methods.

“…The incorporation of the model-based reinforcement learning and the symbolic regression algorithm has been studied in [11]. Reference [11] introduced a method of evolving interpretable policies based on GP. A neural network that builds based on trajectory data is used as a world model to evaluate the fitness value of generated symbolic policies.…”

Section: Related Work (mentioning)

confidence: 99%
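
The statement above describes scoring candidate symbolic policies with a learned world model rather than the real system. A minimal sketch of that idea follows, under loud assumptions: `world_model` is a hand-written stand-in (the cited work is described as using a neural network trained on trajectory data), the two candidate policies are written by hand rather than evolved by GP, and all names and constants are hypothetical.

```python
def world_model(state, action):
    """Hypothetical stand-in for a learned world model.

    Assumption: a neural network trained on trajectory data would play this
    role; a hand-written 1-D system is used here so the sketch runs without
    any dependencies.
    """
    position, velocity = state
    velocity = 0.9 * velocity + 0.1 * action
    position = position + velocity
    reward = -(position ** 2)  # reward is highest at position 0
    return (position, velocity), reward


def fitness(policy, start_states, horizon=20):
    """Average model-predicted return of a policy over the start states."""
    total = 0.0
    for state in start_states:
        for _ in range(horizon):
            action = policy(state)
            state, reward = world_model(state, action)
            total += reward
    return total / len(start_states)


# Two hand-written candidate policies standing in for GP individuals.
candidates = {
    "a = 0": lambda s: 0.0,
    "a = -(x + 2*v)": lambda s: -(s[0] + 2.0 * s[1]),
}

start_states = [(1.0, 0.0), (-0.5, 0.2), (2.0, -0.1)]
scores = {name: fitness(p, start_states) for name, p in candidates.items()}
best = max(scores, key=scores.get)  # candidate with the highest predicted return
```

In a GP loop, `fitness` would be called on every individual in each generation, so the real system is never queried during the search; only the selection step shown in the last line is sketched here.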

“…The idea is to evolve an explainable model to extract the policy from the deep neural network. In [11], an explainable reinforcement learning policy model is built by using the tree-based genetic programming (GP) [24] algorithm. However, it is argued in [11] that it is hard for GP to mimic the behavior of the deep neural network.…”

Section: Introduction (mentioning)

confidence: 99%

“…In [11], an explainable reinforcement learning policy model is built by using the tree-based genetic programming (GP) [24] algorithm. However, it is argued in [11] that it is hard for GP to mimic the behavior of the deep neural network. Therefore, in that paper, DNN is used as a surrogate model, and then GP is used to evolving strategy on that model.…”

Section: Introduction (mentioning)

confidence: 99%

Interpretable policy derivation for reinforcement learning based on evolutionary feature synthesis

Zhang; Zhou; Lin

2020

Complex Intell. Syst.

Reinforcement learning based on deep neural networks has attracted much attention and has been widely used in real-world applications. However, its black-box nature limits its use in high-stakes areas such as manufacturing and healthcare. To deal with this problem, some researchers resort to interpretable control policy generation algorithms. The basic idea is to use an interpretable model, such as tree-based genetic programming, to extract a policy from a black-box model, such as a neural network. Following this idea, in this paper we try another form of genetic programming, evolutionary feature synthesis, to extract a control policy from the neural network. We also propose an evolutionary method that automatically optimizes the operator set of the control policy for each specific problem. Moreover, a policy simplification strategy is introduced. We conduct experiments on four reinforcement learning environments. The experimental results reveal that evolutionary feature synthesis extracts policies from the neural network with better performance than tree-based genetic programming and comparable interpretability.

“…Some works explain agent behavior, but require the agent to use a specific, interpretable model. For example, Genetic Programming for Reinforcement Learning (Hein, Udluft, and Runkler 2017) use a genetic algorithm to learn a policy which is inherently explainable. Unlike our method, this method is incompatible with arbitrary RL systems due to its reliance on learning inherently small policies using a genetic algorithm.…”

Section: Introduction (mentioning)

confidence: 99%

Generation of Policy-Level Explanations for Reinforcement Learning

Topin; Veloso

2019

AAAI

Though reinforcement learning has greatly benefited from the incorporation of neural networks, the inability to verify the correctness of such systems limits their use. Current work in explainable deep learning focuses on explaining only a single decision in terms of input features, making it unsuitable for explaining a sequence of decisions. To address this need, we introduce Abstracted Policy Graphs, which are Markov chains of abstract states. This representation concisely summarizes a policy so that individual decisions can be explained in the context of expected future transitions. Additionally, we propose a method to generate these Abstracted Policy Graphs for deterministic policies given a learned value function and a set of observed transitions, potentially off-policy transitions used during training. Since no restrictions are placed on how the value function is generated, our method is compatible with many existing reinforcement learning methods. We prove that the worst-case time complexity of our method is quadratic in the number of features and linear in the number of provided transitions, $O(|F|^2 \, |tr\_samples|)$. By applying our method to a family of domains, we show that our method scales well in practice and produces Abstracted Policy Graphs which reliably capture relationships within these domains.
