1995
DOI: 10.1613/jair.135
Truncating Temporal Differences: On the Efficient Implementation of TD(lambda) for Reinforcement Learning

Abstract: Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor λ. Currently the most important application of these methods is to temporal credit assignment in reinforcement learning. Well known reinforcement learning algorithms, such as AHC or Q-learning, may be viewed as instances of TD learning. This paper examines the issues of the efficient and general implementation of TD(λ) for arbitrary λ, for use with reinforcement…
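As background for the truncated abstract (standard notation, assumed rather than quoted from the paper): in TD(λ) the prediction target is the λ-return, a recency-weighted mixture of n-step returns:

```latex
% lambda-return: geometric mixture of n-step returns with recency factor lambda
R_t^{\lambda} = (1-\lambda) \sum_{n=1}^{\infty} \lambda^{n-1} R_t^{(n)},
\qquad
R_t^{(n)} = r_{t+1} + \gamma r_{t+2} + \cdots + \gamma^{n-1} r_{t+n} + \gamma^{n} V(s_{t+n}).
```

Setting λ = 0 recovers the one-step TD target, while λ = 1 recovers the full Monte Carlo return.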

Cited by 44 publications (31 citation statements); references 18 publications (41 reference statements).
“…The TTD procedure [2,4] allows one to implement TD-based reinforcement learning algorithms in their TD(λ) (λ > 0) versions without conceptually appealing, but computationally demanding, eligibility traces [1,10,4]. Only its particular instantiation for the Q-learning algorithm is discussed below, but modifications for other RL algorithms are straightforward [2].…”
Section: TTD Procedure
confidence: 99%
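The TTD idea cited above — approximating TD(λ) by computing a λ-return truncated to the last m buffered experiences, instead of maintaining eligibility traces — can be sketched for tabular Q-learning as follows. This is a minimal illustration under assumed notation; the function and buffer layout are not the paper's own:

```python
import numpy as np
from collections import deque

def ttd_q_update(Q, buffer, gamma, lam, alpha):
    """Sketch of a TTD-style update for tabular Q-learning.

    `buffer` holds the m most recent transitions (state, action, reward,
    next_state), oldest first. The truncated lambda-return for the oldest
    transition is computed backwards through the buffer, then used as the
    update target for that transition's Q-value. No eligibility traces
    are needed; memory cost is O(m).
    """
    # Base case: one-step target at the newest buffered transition.
    s, a, r, s_next = buffer[-1]
    z = r + gamma * np.max(Q[s_next])
    # Recursive form, applied backwards:
    #   z_k = r_k + gamma * ((1 - lam) * max_a Q(s'_k, a) + lam * z_{k+1})
    for (s, a, r, s_next) in reversed(list(buffer)[:-1]):
        z = r + gamma * ((1 - lam) * np.max(Q[s_next]) + lam * z)
    # Apply the TD update to the oldest transition in the buffer.
    s0, a0, _, _ = buffer[0]
    Q[s0, a0] += alpha * (z - Q[s0, a0])
    return z
```

Note that with lam = 0 the recursion collapses to the ordinary one-step Q-learning target (delayed by the buffer length), which matches the claim that TTD subsumes standard TD-based algorithms as the λ = 0 case.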