2017
DOI: 10.1007/978-3-319-59394-4_18

Pseudorehearsal in Value Function Approximation

Abstract: Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole balancing task. We have found that pseudorehearsal appears to assist learning even in such a simple problem, given proper initialization of the rehearsal parameters.
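
As a concrete illustration of the technique the abstract describes, here is a minimal sketch: pseudo-items are random inputs labelled with the current network's own outputs, and rehearsing them alongside the TD update resists catastrophic forgetting. This is not the paper's implementation; the network size, the pseudo-item distribution, the mixing weight `BETA`, and the helper names (`make_pseudo_items`, `q_update`) are illustrative assumptions.

```python
# Hedged sketch of pseudorehearsal for Q-learning with a neural
# function approximator. All sizes, rates, and names are illustrative.
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 4, 2          # e.g. pole balancing: 4 state variables, 2 actions
GAMMA, BETA, N_PSEUDO = 0.99, 1.0, 32

q_net = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.Tanh(), nn.Linear(32, N_ACTIONS))
opt = torch.optim.SGD(q_net.parameters(), lr=1e-2)

def make_pseudo_items(net, n):
    """Sample random inputs and record the *current* network's outputs.

    Rehearsing these (input, output) pairs later pulls the network back
    toward its old input-output mapping, which is the core of
    pseudorehearsal (no stored real experience is needed)."""
    with torch.no_grad():
        x = torch.rand(n, STATE_DIM) * 2 - 1   # assumed input range [-1, 1]
        y = net(x).clone()
    return x, y

def q_update(s, a, r, s_next, done, pseudo_x, pseudo_y):
    # Standard one-step Q-learning target.
    with torch.no_grad():
        target = r + (0.0 if done else GAMMA * q_net(s_next).max().item())
    td_loss = (q_net(s)[a] - target) ** 2
    # Pseudorehearsal term: keep outputs on the pseudo-items close to
    # the values they had when the items were generated.
    rehearsal_loss = ((q_net(pseudo_x) - pseudo_y) ** 2).mean()
    loss = td_loss + BETA * rehearsal_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In practice the pseudo-item buffer would be refreshed periodically, since stale pseudo-targets anchor the network to an outdated value function; how often to refresh is one of the rehearsal parameters the paper's initialization caveat refers to.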

Cited by 2 publications (2 citation statements) | References 23 publications
“…PR is a simple and computationally efficient method for solving the catastrophic forgetting (CF) problem, which has proven successful in unsupervised learning [17], supervised learning [21], [16], and sometimes in reinforcement learning as well [22], [14], [23]. It is interesting to note that Baddeley's results suggest that the widely studied ill conditioning might not be the main bottleneck of reinforcement learning, while CF may be.…”
Section: Pseudorehearsal (mentioning, confidence: 99%)
“…We have shown that pseudorehearsal can significantly improve performance in Q-learning algorithms [1], and now want to test it on the more interesting and complex actor-critic algorithm. Actor-critic methods are a class of reinforcement learning algorithms based on TD-learning.…”
Section: Introduction (mentioning, confidence: 99%)
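
For context, the actor-critic update the citing work refers to can be sketched as follows. This is a generic one-step formulation, not that paper's implementation; the network sizes, learning rates, and function name (`actor_critic_step`) are illustrative assumptions.

```python
# Hedged sketch of a one-step actor-critic TD update.
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 4, 2, 0.99
actor = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.Tanh(),
                      nn.Linear(32, N_ACTIONS), nn.Softmax(dim=-1))
critic = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.Tanh(), nn.Linear(32, 1))
a_opt = torch.optim.SGD(actor.parameters(), lr=1e-3)
c_opt = torch.optim.SGD(critic.parameters(), lr=1e-2)

def actor_critic_step(s, a, r, s_next, done):
    # TD error from the critic's value estimates drives both updates.
    with torch.no_grad():
        td_target = r + (0.0 if done else GAMMA * critic(s_next).item())
    value = critic(s)
    td_error = td_target - value.item()
    # Critic: regress V(s) toward the one-step TD target.
    c_loss = (td_target - value) ** 2
    c_opt.zero_grad()
    c_loss.backward()
    c_opt.step()
    # Actor: policy-gradient step weighted by the TD error.
    logp = torch.log(actor(s)[a])
    a_loss = -td_error * logp
    a_opt.zero_grad()
    a_loss.backward()
    a_opt.step()
```

Pseudorehearsal would extend this by adding a rehearsal penalty (as in the Q-learning sketch above) to the critic loss, the actor loss, or both; which combination works best is presumably what the citing work sets out to test.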