2016
DOI: 10.1609/icaps.v26i1.13745
OGA-UCT: On-the-Go Abstractions in UCT

Abstract: Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree Search (MCTS) algorithms for probabilistic planning. These algorithms automatically aggregate symmetric search nodes (states or state-action pairs), saving valuable planning time. Existing algorithms alternate between two phases: (1) abstraction computation, for computing node aggregations, and (2) modified MCTS that uses aggregate nodes. We believe that these algorithms do not achieve the full potential of abstractions because o…

Cited by 10 publications (7 citation statements)
References 13 publications
“…However, the bisimulation metric is difficult to satisfy completely and is not robust to small changes in the transition probabilities and rewards (Moerland et al. 2023). Safe state abstraction methods avoid the non-Markovian problem by ignoring irrelevant state variables (Andre and Russell 2002; Anand et al. 2015), thus yielding state abstraction while maintaining hierarchical optimality, i.e., optimality among all policies consistent with the partial program. One method to resolve this situation is to introduce an ad hoc weighting function (the aggregation probability), which functions like an occupancy probability for each concrete state given the abstract state.…”
Section: Markovian Abstraction
confidence: 99%
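The aggregation probability described in the statement above can be illustrated as a weighted average: the abstract state's value is a mixture of its concrete states' values under occupancy-like weights w(s | s_abs). This is a minimal sketch of that idea; all names and values here are illustrative, not taken from the cited papers.

```python
def abstract_value(concrete_values, weights):
    """Weighted average of concrete-state values under weights w(s | s_abs).

    concrete_values: dict mapping concrete state -> value estimate
    weights: dict mapping concrete state -> aggregation probability
    """
    # The aggregation probabilities behave like a distribution over the
    # concrete states grouped into one abstract state, so they must sum to 1.
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[s] * concrete_values[s] for s in concrete_values)

# Two concrete states aggregated into one abstract state (illustrative).
values = {"s1": 1.0, "s2": 3.0}
weights = {"s1": 0.25, "s2": 0.75}
print(abstract_value(values, weights))  # -> 2.5
```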
“…This group is gradually divided into different groups to refine the state abstraction. On-The-Go abstraction [13] updates the abstraction more frequently, avoiding the influence of delayed samples. A recent visit count is kept for each node, and the abstraction mapping for a node is re-computed once the recent visit count reaches a threshold.…”
Section: Related Work
confidence: 99%
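The recomputation trigger described in this statement — track recent visits per node and refresh that node's abstraction mapping when the count crosses a threshold — can be sketched as follows. The class and function names, the threshold value, and the placeholder grouping rule are all illustrative assumptions, not the OGA-UCT implementation.

```python
RECOMPUTE_THRESHOLD = 16  # illustrative value; the paper tunes this

class Node:
    """Search-tree node tracking visits since its last abstraction update."""
    def __init__(self, state):
        self.state = state
        self.recent_visits = 0   # visits since the abstraction was last computed
        self.abstract_id = None  # current abstract group for this node

def compute_abstraction(node):
    # Placeholder grouping rule: a real implementation would aggregate nodes
    # whose sampled transition and reward statistics are equivalent.
    return hash(node.state) % 4

def on_visit(node):
    """Called each time the search visits this node."""
    node.recent_visits += 1
    if node.recent_visits >= RECOMPUTE_THRESHOLD:
        node.abstract_id = compute_abstraction(node)
        node.recent_visits = 0  # reset the recent-visit counter

node = Node(state=("s", 0))
for _ in range(RECOMPUTE_THRESHOLD):
    on_visit(node)
print(node.abstract_id is not None)  # abstraction was recomputed at the threshold
```

Recomputing on a recent-visit counter (rather than once per planning phase) is what makes the abstraction "on the go": nodes whose statistics change quickly get their groupings refreshed without waiting for a global recomputation pass.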
“…The proposed method improved the agent's performance in the game Othello when combined with further action abstraction and tree pruning. [11] extended the application of approximate MDP homomorphism from state abstraction to state-action abstraction where the state-action pairs are grouped. In our paper, we adapt approximate MDP homomorphism from [9] to perform in the more challenging domain of strategy games, without providing game expert knowledge.…”
Section: Related Work
confidence: 99%
“…Current state-of-the-art planners for RDDL problems are based on online search, where at each step some combination of search and reasoning is used to select an action. For example, there are planners based on sample-based tree search (Keller and Eyerich 2012; Kolobov et al. 2012; Bonet and Geffner 2012), symbolic variants (Cui et al. 2015; Raghavan et al. 2015; Anand et al. 2016), and those that construct and solve integer linear programs at each step (Issakkimuthu et al. 2015). These planners can require non-trivial computation time per step, which can make them inapplicable to problems that require fast decisions.…”
Section: Introduction
confidence: 99%