Machine Learning Proceedings 1995 (1995)
DOI: 10.1016/b978-1-55860-377-6.50021-9

Fast and Efficient Reinforcement Learning with Truncated Temporal Differences

Cited by 15 publications (16 citation statements)
References 10 publications
“…However, the number of state-action pairs for which actual updating is required can be kept at a manageable level by maintaining only those state-action pairs whose activity trace $(\gamma\lambda)^n$ is significant, since this quantity declines exponentially when $\gamma\lambda < 1$. For a more elaborate procedure see Cichosz & Mulawka (1995). Another approach is to implement a Q($\lambda$)-learning system on a parallel machine in which each state-action pair is mapped onto a separate processor.…”
Section: $r_t^{\lambda} - \hat{Q}_t(x_t, a_t) = e'_t + \gamma\lambda e_{t+1} + \gamma^2\lambda^2 e_{t+2} + \cdots$ (mentioning)
confidence: 99%
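
The pruning idea in the statement above can be illustrated with a minimal Python sketch. It is not the procedure of Cichosz & Mulawka (1995) or of the citing paper; the function names, the dictionary representation of traces, and the `threshold` parameter are hypothetical choices made only for illustration. Each stored trace is multiplied by $\gamma\lambda$ per step, so a trace set to 1 on a visit equals $(\gamma\lambda)^n$ after $n$ steps and can be dropped once it falls below the threshold.

```python
from collections import defaultdict


def decay_and_prune(traces, gamma, lam, threshold):
    """Decay every trace by gamma * lam and keep only the significant ones.

    `traces` maps (state, action) pairs to activity-trace values.
    `threshold` is a hypothetical cut-off below which a trace is dropped.
    """
    decay = gamma * lam
    return {sa: e * decay for sa, e in traces.items() if e * decay >= threshold}


def apply_td_error(q, traces, alpha, td_error):
    """Update only the state-action pairs that still hold a trace."""
    for sa, e in traces.items():
        q[sa] += alpha * td_error * e


# Illustrative usage with made-up values.
q = defaultdict(float)
traces = {("s0", "a0"): 1.0, ("s1", "a1"): 0.01}
traces = decay_and_prune(traces, gamma=0.9, lam=0.5, threshold=0.01)
apply_td_error(q, traces, alpha=0.5, td_error=2.0)
```

Because the traces shrink geometrically when $\gamma\lambda < 1$, the number of pairs kept in the dictionary stays bounded by how many steps it takes a fresh trace to fall below the cut-off.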
“…For all algorithms $\gamma$ was set to 0.95 and T to 0.01. These values appeared reasonable based on prior work with TTD [2,4].…”
Section: Experimental Design and Results (mentioning)
confidence: 90%
“…This can be performed either iteratively, based directly on the definition of TTD returns, or in an incremental manner, which is particularly efficient. The appropriate algorithms are described in detail in the existing TTD literature [2,4].…”
Section: TTD Procedures (mentioning)
confidence: 99%
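
As a rough illustration of the iterative computation mentioned in the statement above, the following Python sketch evaluates an $m$-step truncated $\lambda$-return by a backward recursion. It assumes the usual truncated $\lambda$-return definition, in which the final step bootstraps fully from the last value estimate; the exact algorithms are those in the cited TTD literature [2,4], not this sketch. The function name and argument layout are hypothetical.

```python
def ttd_return(rewards, next_values, gamma, lam):
    """Truncated lambda-return over m = len(rewards) steps (assumed definition).

    rewards[k]     : reward received at step t + k
    next_values[k] : current value estimate of the state reached at step t + k + 1
    """
    if not rewards or len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must be non-empty and equally long")
    # Start from the value of the last state, so the final step bootstraps fully.
    z = next_values[-1]
    # Work backwards from step t + m - 1 down to step t.
    for r, u in zip(reversed(rewards), reversed(next_values)):
        z = r + gamma * (lam * z + (1.0 - lam) * u)
    return z


# Illustrative usage with made-up numbers.
one_step = ttd_return([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], gamma=0.5, lam=0.0)
m_step = ttd_return([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], gamma=0.5, lam=1.0)
```

With `lam=0.0` the result reduces to the one-step return, and with `lam=1.0` to the plain $m$-step return, which gives a quick sanity check on the recursion.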