DOI: 10.29007/s8jk

Learning to Plan from Raw Data in Grid-based Games

Abstract: An agent that autonomously learns to act in its environment must acquire a model of the domain dynamics. This can be a challenging task, especially in real-world domains, where observations are high-dimensional and noisy. Although in automated planning the dynamics are typically given, there are action schema learning approaches that learn symbolic rules (e.g. STRIPS or PDDL) to be used by traditional planners. However, these algorithms rely on logical descriptions of environment observations. In contrast, r…
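To make the target representation concrete, the following is a minimal, hypothetical PDDL action schema of the kind such action-schema learners aim to recover for a grid-based game. The domain, predicate, and action names are illustrative and do not come from the paper.

;; A minimal sketch, assuming an untyped STRIPS-style grid domain;
;; names (grid-game, at, adjacent, clear, move) are hypothetical.
(define (domain grid-game)
  (:requirements :strips)
  (:predicates
    (at ?c)           ; the agent occupies cell ?c
    (adjacent ?a ?b)  ; cells ?a and ?b are neighbouring grid cells
    (clear ?c))       ; cell ?c contains no obstacle
  (:action move
    :parameters (?from ?to)
    :precondition (and (at ?from) (adjacent ?from ?to) (clear ?to))
    :effect (and (at ?to) (not (at ?from)))))

A classical planner can use such a schema directly; the learning problem sketched in the abstract is to induce the preconditions and effects from raw, high-dimensional observations rather than from logical state descriptions.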

Cited by 3 publications (5 citation statements)
References 15 publications
“…The experimental results show that SPOTTER achieved higher overall rewards than the baselines in the given time frame, and did so more quickly. Crucially, the agent learned the missing operator for moving the blue ball out of the way in Level 2 and was immediately able to use this operator in Level 3. This is demonstrated both by the fact that the agent did not experience any drop in performance when transitioning to Level 3, and by the fact that, as we know from running the experiment, the agent did not enter learn or gen-precon in Level 3.…”
Section: Methods (mentioning)
confidence: 99%
“…Accordingly, we did not compare against any deep RL baselines. We also did not compare transfer learning and curriculum learning approaches as these approaches… [Footnote 7: Code implementing SPOTTER and the baselines along with experiments will be made available post-review.] [Footnote 8: In the supplementary material, we provide the learned operator described in PDDL, learning curves for the baselines over 2,000,000 episodes, and videos showing SPOTTER's integrated planning and learning.]…”
Section: Methods (mentioning)
confidence: 99%
“…Neural networks have also been applied to other aspects of planning. For instance, (Dittadi et al., 2018) trains a NN that learns a planning domain just from visual observations, assuming that actions have local preconditions and effects. The learned domain is generalizable across different problems of the same domain and, thus, can be used by a planner to solve these problems.…”
Section: Related Work (mentioning)
confidence: 99%
“…Neural networks have also been applied to other aspects of planning. For instance, (Dittadi, Bolander, and Winther 2018) trains a NN that learns a planning domain just from visual observations, assuming that actions have local preconditions and effects. The learnt domain is generalizable across different problems of the same domain and, thus, can be used by a planner to solve these problems.…”
Section: Related Work (mentioning)
confidence: 99%
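As a concrete reading of the "local preconditions and effects" assumption mentioned in the two statements above, here is a hypothetical push operator in the same PDDL style: every cell it references is adjacent to the previous one, so the rule depends only on a local patch of the observation. The predicate and action names are again illustrative, and the collinearity constraint of a real pushing action is omitted for brevity.

;; Hypothetical Sokoban-style operator with purely local scope;
;; ball-at and push are illustrative names, not taken from the paper.
(:action push
  :parameters (?agent ?ball ?dest)
  :precondition (and (at ?agent)
                     (ball-at ?ball)
                     (adjacent ?agent ?ball)
                     (adjacent ?ball ?dest)
                     (clear ?dest))
  :effect (and (at ?ball) (not (at ?agent))         ; agent steps into the ball's cell
               (ball-at ?dest) (not (ball-at ?ball)))) ; ball is pushed one cell onward

Because the precondition and effect mention only cells reachable within two adjacency steps of the agent, an operator like this can in principle be induced from local windows of the raw observation, which is the property the cited work exploits to generalize the learned domain across problems.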