2013
DOI: 10.1007/s10458-013-9235-z

Learning potential functions and their representations for multi-task reinforcement learning

Cited by 34 publications (12 citation statements)
References 54 publications
“…We provide insights into optimal reward shaping, and propose a novel meta-learning framework to automatically learn such reward shaping to apply on newly sampled tasks. Theoretical analysis and extensive experiments establish us as the state-of-the-art in learning task-distribution reward shaping, outperforming previous such works (Konidaris and Barto 2006; Snel and Whiteson 2014). We further show that our method outperforms learning intrinsic rewards (Yang et al 2019; Zheng et al 2020), outperforms Rainbow (Hessel et al 2018) in complex pixel-based CoinRun games, and is also better than hand-designed reward shaping on grid mazes.…”
supporting
confidence: 57%
“…However, potential functions will typically need to be pre-specified. This has restricted the use of PBRS to tabular/low-dimensional state spaces [15,38]. The cycling problem (repeatedly visiting certain states) mentioned in Section 4.1 can be resolved by PBRS.…”
Section: Related Work
mentioning
confidence: 99%
“…The multi-task community has roughly separated learnable knowledge into two categories [44]. Task relevant knowledge pertains to a particular task [23,51]; meanwhile, domain relevant knowledge is common across all tasks [6,14,25].…”
Section: Multi-task Learning
mentioning
confidence: 99%