2005
DOI: 10.1007/s10994-005-0460-9

Evolving Soccer Keepaway Players Through Task Decomposition

Abstract: Complex control tasks can often be solved by decomposing them into hierarchies of manageable subtasks. Such decompositions require designers to decide how much human knowledge should be used to help learn the resulting components. On one hand, encoding human knowledge requires manual effort and may incorrectly constrain the learner's hypothesis space or guide it away from the best solutions. On the other hand, it may make learning easier and enable the learner to tackle more complex tasks. This artic…

Cited by 85 publications (36 citation statements)
References 28 publications
“…In particular, gradient-based subgoal discovery with FNNs or RNNs decomposes RL tasks into subtasks for RL submodules (Schmidhuber, 1991b; Schmidhuber and Wahnsiedler, 1992). Numerous alternative HRL techniques have been proposed (e.g., Ring, 1991, 1994; Jameson, 1991; Tenenberg et al., 1993; Weiss, 1994; Moore and Atkeson, 1995; Precup et al., 1998; Dietterich, 2000b; Menache et al., 2002; Doya et al., 2002; Ghavamzadeh and Mahadevan, 2003; Barto and Mahadevan, 2003; Samejima et al., 2003; Bakker and Schmidhuber, 2004; Whiteson et al., 2005; Simsek and Barto, 2008). While HRL frameworks such as Feudal RL (Dayan and Hinton, 1993) and options (Sutton et al., 1999b; Barto et al., 2004; Singh et al., 2005) do not directly address the problem of automatic subgoal discovery, HQ-Learning (Wiering and Schmidhuber, 1998a) automatically decomposes POMDPs (Sec.…”
Section: Deep Hierarchical RL (HRL) and Subgoal Learning With FNNs An… (mentioning)
confidence: 99%
“…The probable locations of the opponent team members were modeled. Further, Whiteson et al [15] worked on RoboCup Keepaway, where three players must keep the ball away from a fourth player. Implicit opponent models are needed in order to avoid the fourth player.…”
Section: Related Work (mentioning)
confidence: 99%
“…Keepaway is an appealing platform for empirical comparisons because the performance of TD methods has already been established in previous studies [8,23]. While GAs have been applied to variations of Keepaway [7,26], they have never, to our knowledge, been applied to the task's benchmark version.…”
Section: The Benchmark Keepaway Task (mentioning)
confidence: 99%
“…There are a wide variety of both GAs and TD methods in use today but in order to compare these different approaches empirically we must focus on specific instantiations. We use Sarsa and NEAT as representative methods because of their empirical success in the benchmark Keepaway task [23] or variations thereof [26].…”
Section: Introduction (mentioning)
confidence: 99%