2016
DOI: 10.1609/icaps.v26i1.13745
OGA-UCT: On-the-Go Abstractions in UCT

Abstract: Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree Search (MCTS) algorithms for probabilistic planning. These algorithms automatically aggregate symmetric search nodes (states or state-action pairs), saving valuable planning time. Existing algorithms alternate between two phases: (1) abstraction computation, for computing node aggregations, and (2) modified MCTS that uses aggregate nodes. We believe that these algorithms do not achieve the full potential of abstractions because o…

Cited by 10 publications (7 citation statements)
References 13 publications
“…However, the bisimulation metric is difficult to satisfy completely and is not robust to small changes in the transition probabilities and rewards (Moerland et al. 2023). Safe state abstraction methods avoid the non-Markovian problem by ignoring irrelevant state variables (Andre and Russell 2002; Anand et al. 2015), thus yielding state abstraction while maintaining hierarchical optimality, i.e., optimality among all policies consistent with the partial program. One method to resolve this situation is to introduce an ad hoc weighting function (the aggregation probability), which functions like an occupancy probability for each concrete state given the abstract state.…”
Section: Markovian Abstraction
confidence: 99%
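The aggregation probability described in the statement above can be illustrated as a weighted average: the abstract state's value is a mixture of its concrete states' values under occupancy-like weights w(s | s_abs). This is a minimal sketch of that idea; all names and values here are illustrative, not taken from the cited papers.

```python
def abstract_value(concrete_values, weights):
    """Weighted average of concrete-state values under weights w(s | s_abs).

    concrete_values: dict mapping concrete state -> value estimate
    weights: dict mapping concrete state -> aggregation probability
    """
    # The aggregation probabilities behave like a distribution over the
    # concrete states grouped into one abstract state, so they must sum to 1.
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[s] * concrete_values[s] for s in concrete_values)

# Two concrete states aggregated into one abstract state (illustrative).
values = {"s1": 1.0, "s2": 3.0}
weights = {"s1": 0.25, "s2": 0.75}
print(abstract_value(values, weights))  # -> 2.5
```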
“…This group is gradually divided into different groups to refine the state abstraction. On-The-Go abstraction [13] updates the abstraction more frequently, avoiding the influence of delayed samples. A recent visit count is kept for each node, and the abstraction mapping for a node is re-computed once the recent visit count reaches a threshold.…”
Section: Related Work
confidence: 99%
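The recomputation trigger described in this statement — track recent visits per node and refresh that node's abstraction mapping when the count crosses a threshold — can be sketched as follows. The class and function names, the threshold value, and the placeholder grouping rule are all illustrative assumptions, not the OGA-UCT implementation.

```python
RECOMPUTE_THRESHOLD = 16  # illustrative value; the paper tunes this

class Node:
    """Search-tree node tracking visits since its last abstraction update."""
    def __init__(self, state):
        self.state = state
        self.recent_visits = 0   # visits since the abstraction was last computed
        self.abstract_id = None  # current abstract group for this node

def compute_abstraction(node):
    # Placeholder grouping rule: a real implementation would aggregate nodes
    # whose sampled transition and reward statistics are equivalent.
    return hash(node.state) % 4

def on_visit(node):
    """Called each time the search visits this node."""
    node.recent_visits += 1
    if node.recent_visits >= RECOMPUTE_THRESHOLD:
        node.abstract_id = compute_abstraction(node)
        node.recent_visits = 0  # reset the recent-visit counter

node = Node(state=("s", 0))
for _ in range(RECOMPUTE_THRESHOLD):
    on_visit(node)
print(node.abstract_id is not None)  # abstraction was recomputed at the threshold
```

Recomputing on a recent-visit counter (rather than once per planning phase) is what makes the abstraction "on the go": nodes whose statistics change quickly get their groupings refreshed without waiting for a global recomputation pass.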
“…The proposed method improved the agent's performance in the game Othello when combined with further action abstraction and tree pruning. [11] extended the application of approximate MDP homomorphism from state abstraction to state-action abstraction where the state-action pairs are grouped. In our paper, we adapt approximate MDP homomorphism from [9] to perform in the more challenging domain of strategy games, without providing game expert knowledge.…”
Section: Related Work
confidence: 99%
“…Current state-of-the-art planners for RDDL problems are based on online search, where at each step some combination of search and reasoning is used to select an action. For example, there are planners based on sample-based tree search (Keller and Eyerich 2012; Kolobov et al. 2012; Bonet and Geffner 2012), symbolic variants (Cui et al. 2015; Raghavan et al. 2015; Anand et al. 2016), and those that construct and solve integer linear programs at each step (Issakkimuthu et al. 2015). These planners can require non-trivial computation time per step, which can make them inapplicable to problems that require fast decisions.…”
Section: Introduction
confidence: 99%