2018
DOI: 10.1016/j.engappai.2018.09.007

Interpretable policies for reinforcement learning by genetic programming

Abstract: The search for interpretable reinforcement learning policies is of high academic and industrial interest. Especially for industrial systems, domain experts are more likely to deploy autonomously learned controllers if they are understandable and convenient to evaluate. Basic algebraic equations are supposed to meet these requirements, as long as they are restricted to an adequate complexity. Here we introduce the genetic programming for reinforcement learning (GPRL) approach based on model-based batch reinforc…

Cited by 102 publications (67 citation statements)
References 30 publications (36 reference statements)
“…Similarly, Hein et al [85] introduced another algebraic based method for RL interpretation but with using genetic algorithms. The authors collected data of four dimensions that include: current state, action, next state and the received reward, and then trained their algorithm for two Atari games:…”
Section: Algebraic Languages (mentioning)
confidence: 99%
“…The incorporation of the model-based reinforcement learning and the symbolic regression algorithm has been studied in [11]. Reference [11] introduced a method of evolving interpretable policies based on GP. A neural network that builds based on trajectory data is used as a world model to evaluate the fitness value of generated symbolic policies.…”
Section: Related Work (mentioning)
confidence: 99%
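
The statement above describes scoring symbolic policies against a world model instead of the real system. A minimal sketch of that idea follows, under loud assumptions: `world_model` is a hand-written toy function standing in for the neural network trained on trajectory data, and `policy_a`, `policy_b`, `fitness`, the horizon, and the discount factor are all hypothetical names and values chosen for illustration, not taken from the paper.

```python
def world_model(state, action):
    """Toy stand-in (assumption) for a learned model: returns (next_state, reward)."""
    next_state = 0.9 * state + 0.1 * action
    reward = -(next_state ** 2)  # reward is highest when the state stays near zero
    return next_state, reward


def policy_a(state):
    """Hypothetical candidate policy written as a short algebraic expression."""
    return -2.0 * state


def policy_b(state):
    """A second hypothetical candidate, used only for comparison."""
    return 0.5 * state


def fitness(policy, start_states, horizon=20, gamma=0.97):
    """Average discounted return of model rollouts, one rollout per start state."""
    returns = []
    for state in start_states:
        total, discount = 0.0, 1.0
        for _ in range(horizon):
            action = policy(state)
            state, reward = world_model(state, action)
            total += discount * reward
            discount *= gamma
        returns.append(total)
    return sum(returns) / len(returns)


start_states = [1.0, -0.5, 2.0]
scores = {
    "policy_a": fitness(policy_a, start_states),
    "policy_b": fitness(policy_b, start_states),
}
best = max(scores, key=scores.get)  # a GP search would keep and vary the fitter candidates
```

In a genetic programming loop, a score of this kind would rank a whole population of expression trees; the sketch compares only two fixed expressions to keep the model-based evaluation step visible.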
“…The idea is to evolve an explainable model to extract the policy from the deep neural network. In [11], an explainable reinforcement learning policy model is built by using the tree-based genetic programming (GP) [24] algorithm. However, it is argued in [11] that it is hard for GP to mimic the behavior of the deep neural network.…”
Section: Introduction (mentioning)
confidence: 99%
“…Some works explain agent behavior, but require the agent to use a specific, interpretable model. For example, Genetic Programming for Reinforcement Learning (Hein, Udluft, and Runkler 2017) use a genetic algorithm to learn a policy which is inherently explainable. Unlike our method, this method is incompatible with arbitrary RL systems due to its reliance on learning inherently small policies using a genetic algorithm.…”
Section: Introduction (mentioning)
confidence: 99%