2017
DOI: 10.1016/j.artint.2015.02.006

Relational reinforcement learning with guided demonstrations

Abstract: Model-based reinforcement learning is a powerful paradigm for learning tasks in robotics. However, in-depth exploration is usually required and the actions have to be known in advance. Thus, we propose a novel algorithm that integrates the option of requesting teacher demonstrations to learn new domains with fewer action executions and no previous knowledge. Demonstrations allow new actions to be learned and they greatly reduce the amount of exploration required, but they are only requested when they are expec…

Citation Types: 1 supporting, 27 mentioning, 0 contrasting
Year Published: 2018-2024

Cited by 37 publications (28 citation statements)
References 20 publications
“…In contrast, we use the REX-D algorithm [61], which combines relational RL and active demonstration requests. REX-D requests demonstrations only when they can save a lot of time, because the teacher's time is considered very valuable, and it uses autonomous exploration otherwise.…”
Section: High-level Planning System and Execution (mentioning)
confidence: 99%
“…We introduced the REX-D algorithm [61] to address the learning phase, which is an efficient model-based reinforcement learning (RL) method combined with additional human demonstrations upon request. It can take one of three alternative strategies: one is to explore the state space to improve the model and achieve better rewards in the long term; another is to exploit the available knowledge by executing the manipulations that maximize the reward with the current learned model [76]; and the last one is to request a demonstration from the teacher [60].…”
Section: B. Learning on the Planning Level (mentioning)
confidence: 99%
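
The two excerpts above describe the same decision REX-D makes at each step: exploit the learned model, explore autonomously, or ask the teacher for a demonstration when that is expected to save substantial time. The following is a minimal illustrative sketch of that three-way choice; all names, thresholds, and the cost comparison are hypothetical stand-ins, not the actual REX-D implementation.

```python
# Hypothetical sketch of the explore / exploit / request-demonstration
# decision described in the citation statements above. None of these
# names or thresholds come from the REX-D paper; they are illustrative.

import random


def choose_strategy(model_confidence: float,
                    expected_exploration_cost: float,
                    demonstration_cost: float) -> str:
    """Pick one of the three strategies mentioned in the excerpts.

    model_confidence: how well the learned relational model explains
        past action outcomes (0 = unknown domain, 1 = fully learned).
    expected_exploration_cost: estimated number of action executions
        needed to learn the missing knowledge autonomously.
    demonstration_cost: fixed cost assigned to the teacher's time.
    """
    if model_confidence > 0.9:
        # The model is reliable: exploit it by acting greedily.
        return "exploit"
    if expected_exploration_cost > demonstration_cost:
        # Exploring would cost more than asking the teacher, so
        # request a demonstration; demonstration_cost is set high
        # because the teacher's time is considered very valuable.
        return "request_demonstration"
    # Otherwise, explore autonomously to improve the model.
    return "explore"


if __name__ == "__main__":
    random.seed(0)
    for step in range(5):
        confidence = random.random()
        exploration_cost = random.uniform(1, 20)
        print(step, choose_strategy(confidence, exploration_cost,
                                    demonstration_cost=10.0))
```

Per the citations, the key design point is that the demonstration cost is set high, so the teacher is only queried when autonomous exploration would be clearly more expensive.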
“…Generally, in PS approaches, the robot is taught an initial trajectory that is then improved through autonomously generated rollouts and policy updates [7]. Model-based RL proved to be efficient for learning action sequences with the user in the role of a teacher [19]. Instead of applying RL to teach the robot new actions and their effects, in this work RL is applied to modify the robot trajectory segment selected by the user.…”
Section: Relevant Work (mentioning)
confidence: 99%
“…MDP is also used to clear objects from a table in fully observable problems with uncertainty [27]. The same authors employ the REX-D algorithm, which integrates active teacher demonstrations to increase learning speed, in order to sweep lentils from a plane [28]. An interactive RL approach with contextual affordances is developed by Cruz et al. to clean a table using state-action-reward-state-action (SARSA) [7].…”
(mentioning)
confidence: 99%