DOI: 10.26481/dis.20130613hb

Automated transfer in reinforcement learning

Abstract: People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper, including the volume, issue, and page numbers.
General rights: Copyright and moral rights for the publications made accessible in the public portal are retained by the authors a…

Cited by 18 publications (27 citation statements)
References 26 publications
“…They use their approach for policy transfer, which differs from the value-transfer method proposed in this paper. Ammar et al. (2014) learn the model of a source MDP and view the prediction error on a target MDP as a dissimilarity measure in the task space.…”
Section: Background and Related Work
Confidence: 99%
“…Task Selection has been studied for general transfer learning, and presents common aspects with the task selection that is part of sequencing in curriculum learning. Several approaches consider learning a mapping from source tasks to target tasks, and estimating the benefit of transferring between the tasks [22]-[24]. Nonetheless, transfer learning is usually performed between two tasks, a source and a target, and task selection methods have never been leveraged to achieve longer sequences.…”
Section: Related Work
Confidence: 99%
“…Consequently, numerous techniques have been proposed [17,30,35] to efficiently reuse the knowledge of learned tasks. A number of these [6,3,26] rely on a measure of similarity between MDPs in order to choose an appropriate source task to transfer from. However, this can be problematic, as no such universal metric exists [6], and some of the useful ones may be computationally expensive [3].…”
Section: Related Work
Confidence: 99%
“…A number of these [6,3,26] rely on a measure of similarity between MDPs in order to choose an appropriate source task to transfer from. However, this can be problematic, as no such universal metric exists [6], and some of the useful ones may be computationally expensive [3]. In the present work, the similarity metric used is computationally inexpensive, and the degree of similarity between two tasks is based solely on the value function weights associated with them.…”
Section: Related Work
Confidence: 99%
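The weight-based similarity described in that last statement admits a very small sketch. Assuming linear value functions V(s) = wᵀφ(s) over a shared feature map, cosine similarity between weight vectors is one computationally inexpensive choice; the metric and all names below are assumptions for illustration, not necessarily those used in the cited work.

```python
# Minimal sketch: task similarity from value-function weight vectors.
# Cosine similarity is an assumed stand-in for the inexpensive metric.
import numpy as np

def task_similarity(w_source, w_target):
    """Cosine similarity between two value-function weight vectors."""
    denom = float(np.linalg.norm(w_source) * np.linalg.norm(w_target))
    return float(np.dot(w_source, w_target)) / denom if denom > 0 else 0.0

# Usage: pick the most similar previously solved task as transfer source.
library = {"task_a": np.array([0.9, 0.1, -0.3]),
           "task_b": np.array([0.2, -0.8, 0.5])}
w_new = np.array([0.8, 0.2, -0.2])
best = max(library, key=lambda k: task_similarity(library[k], w_new))
print("transfer from:", best)
```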