Learning Domain-Independent Planning Heuristics with Hypergraph Networks

Shen, William; Trevizan, Felipe W.; Thiébaux, Sylvie

doi:10.1609/icaps.v30i1.6754

Cited by 27 publications

(35 citation statements)

References 20 publications

“…There has also been an increasing interest in developing deep learning techniques to improve the performance of automated planning (Fern, Khardon, and Tadepalli 2011). For instance, learning policies (Garg, Bajpai, and Mausam 2020; Groshev et al 2018; Toyer et al 2018; Issakkimuthu, Fern, and Tadepalli 2018; Mausam 2019, 2020; Shen et al 2019), planner selection (Sievers et al 2019; Ma et al 2020; Katz et al 2018) and heuristics (Arfaee, Zilles, and Holte 2011; Groshev et al 2018; Samadi, Felner, and Schaeffer 2008; Thayer, Dionne, and Ruml 2011; Garrett, Kaelbling, and Lozano-Pérez 2016; Shen, Trevizan, and Thiébaux 2020) have been widely explored. Our work fits within the heuristic learning category.…”

Section: Related Work (mentioning)

confidence: 99%

“…Recent methods for learning heuristics combine or improve on existing heuristics (Arfaee, Zilles, and Holte 2011; Groshev et al 2018; Samadi, Felner, and Schaeffer 2008; Thayer, Dionne, and Ruml 2011; Garrett, Kaelbling, and Lozano-Pérez 2016; Shen, Trevizan, and Thiébaux 2020). All of these methods use supervised learning but differ in the encoding of the states, proposing, for instance, the use of images (Groshev et al 2018; Ma et al 2020; Katz et al 2018) or sophisticated network models (Shen, Trevizan, and Thiébaux 2020; Toyer et al 2018). A common approach is to do regression on the heuristic values obtained from precomputed plans (Shen, Trevizan, and Thiébaux 2020; Toyer et al 2018; Garrett, Kaelbling, and Lozano-Pérez 2016; Yoon, Fern, and Givan 2008), and for this reason, it is the baseline we used to compare against supervised methods.…”

Section: Related Work (mentioning)

confidence: 99%

“…Sokoban. Sokoban is a game from IPC-2008, also used to evaluate previous Deep Learning approaches (Shen, Trevizan, and Thiébaux 2020; Groshev et al 2018). The agent must push objects around a grid with the goal of moving them to specific locations.…”

Section: Domains (mentioning)

confidence: 99%

“…et al Toyer et al 2018; Shen, Trevizan, and Thiébaux 2020). These approaches are based on supervised learning, and learn from the optimal plans of previously solved planning problems.…”

Section: Introduction (mentioning)

confidence: 99%

Meta Reinforcement Learning for Heuristic Planning

Gutierrez; Leonetti

ICAPS, 2021

Heuristic planning has a central role in classical planning applications and competitions. Thanks to this success, there has been an increasing interest in using Deep Learning to create high-quality heuristics in a supervised fashion, learning from optimal solutions of previously solved planning problems. Meta-reinforcement learning is a fast-growing research area concerned with learning, from many tasks, behaviours that can quickly generalize to new tasks from the same distribution as the training ones. We make a connection between meta-reinforcement learning and heuristic planning, showing that heuristic functions meta-learned from planning problems, in a given domain, can outperform both popular domain-independent heuristics and heuristics learned by supervised learning. Furthermore, while most supervised learning algorithms rely on ad-hoc encodings of the state representation, our method uses as input a general PDDL 3.1 description. We evaluated our heuristic with an A* planner on six domains from the International Planning Competition and the FF Domain Collection, showing that the meta-learned heuristic leads to the expansion, on average, of fewer states than three popular heuristics used by the Fast Downward planner, and a supervised-learned heuristic.
