Abstract: In real-time domains such as video games, planning happens concurrently with execution and the planning algorithm has a strictly bounded amount of time before it must return the next action for the agent to execute. We explore the use of real-time heuristic search in two benchmark domains inspired by video games. Unlike classic benchmarks such as grid pathfinding and the sliding tile puzzle, these new domains feature exogenous change and directed state space graphs. We consider the setting in which planning an…
“…One metric we will discuss is goal achievement time (GAT), which is the overall time spent on planning and execution of the plan (Kiesel, Burns, and Ruml 2015). We assume the execution of an action takes exactly one time step.…”
Section: Methods; citation type: mentioning
confidence: 99%
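The GAT metric quoted above can be illustrated with a minimal sketch. It assumes that planning and execution strictly alternate, that planning time is measured in the same time steps as execution, and that each action takes exactly one time step, as the quote states. The function name and the input list are hypothetical, chosen only for illustration; this is not the cited authors' implementation.

```python
def goal_achievement_time(planning_steps, action_duration=1):
    """Total time to reach the goal under a simple alternating model.

    planning_steps[i] is the number of time steps spent planning before
    the i-th action is executed (an assumed input format). Each action
    is assumed to take `action_duration` time steps (one, per the quote).
    """
    total = 0
    for steps in planning_steps:
        # Time for this decision: planning first, then executing the action.
        total += steps + action_duration
    return total


# Three actions, preceded by 2, 1, and 3 time steps of planning.
gat = goal_achievement_time([2, 1, 3])
print(gat)
```

If planning for the next action overlaps with execution of the current one, the sum would be smaller; that case is deliberately left out of this sketch.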
“…Real-time search is a well-established area addressing this through dedicated heuristic search algorithms (e.g. Korf 1990; Bulitko and Lee 2006; Koenig and Sun 2009; Bulitko et al. 2011; Hernández and Baier 2012; Kiesel, Burns, and Ruml 2015). One important issue in this context is completeness: guaranteeing that the agent will eventually reach a goal state.…”
In real-time planning, the planner must select the next action within a fixed time bound. Because a complete plan may not have been found, the selected action might not lead to a goal and the agent may need to return to its current state. To preserve completeness, real-time search methods incorporate learning, in which heuristic values are updated. Previous work in real-time search has used table-based heuristics, in which the values of states are updated individually. In this paper, we explore the use of abstraction-based heuristics. By refining the abstraction on-line, we can update the values of multiple states, including ones the agent has not yet generated. We test this idea empirically using Cartesian abstractions in the Fast Downward planner. Results on various benchmarks, including the sliding tile puzzle and several IPC domains, indicate that the approach can improve performance compared to traditional heuristic updating. This work brings abstraction refinement, a powerful technique from offline planning, into the real-time setting.
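The table-based learning that this abstract contrasts with abstraction refinement can be sketched as a one-step update in the style of LRTA* (Korf 1990). This is a minimal illustration, not the method of the paper: the graph, the state names, and the helper functions are hypothetical, and unseen states are assumed to start with a heuristic value of 0.

```python
def lrta_star_step(state, successors, h):
    """Pick the best neighbor and update h[state] in the lookup table.

    successors maps a state to a list of (next_state, cost) pairs.
    h is a dict used as the heuristic table; missing entries count as 0.
    """
    best_next, best_f = None, float("inf")
    for nxt, cost in successors[state]:
        f = cost + h.get(nxt, 0.0)
        if f < best_f:
            best_next, best_f = nxt, f
    # Learning step: only the current state's table entry is changed.
    h[state] = max(h.get(state, 0.0), best_f)
    return best_next


def run(start, goal, successors, h, max_steps=100):
    """Repeat single steps until the goal is reached or steps run out."""
    path, state = [start], start
    while state != goal and len(path) <= max_steps:
        state = lrta_star_step(state, successors, h)
        path.append(state)
    return path


# Hypothetical four-state corridor with unit action costs.
successors = {
    "A": [("B", 1)],
    "B": [("A", 1), ("C", 1)],
    "C": [("B", 1), ("G", 1)],
    "G": [],
}
h = {}
path = run("A", "G", successors, h)
print(path, h)
```

Each update touches one table entry, which is the limitation the abstract points to: a refined abstraction can change the values of many states at once, including states not yet generated.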
“…LSS-LRTA* follows the A* convention to order nodes by f value, which is cost-to-come g plus the lower bound estimate on cost-to-go h. However, as pointed out by Mutchler (1986), expanding the frontier node with lowest f is not necessarily the optimal way to make use of a limited number of node expansions because f does not take any heuristic error into account. A better alternative is to sort the open list by f̂, which denotes an estimate of the expected value of f*, rather than a lower bound (Kiesel, Burns, and Ruml 2015). This better matches the principle of rationality, which stipulates minimizing expected cost.…”
Suboptimal heuristic search algorithms can benefit from reasoning about heuristic error, especially in a real-time setting where there is not enough time to search all the way to a goal. However, current reasoning methods implicitly or explicitly incorporate assumptions about the cost-to-go function. We consider a recent real-time search algorithm, called Nancy, that manipulates explicit beliefs about the cost-to-go. The original presentation of Nancy assumed that these beliefs are Gaussian, with parameters following a certain form. In this paper, we explore how to replace these assumptions with actual data. We develop a data-driven variant of Nancy, DDNancy, that bases its beliefs on heuristic performance statistics from the same domain. We extend Nancy and DDNancy with the notion of persistence and prove their completeness. Experimental results show that DDNancy can perform well in domains in which the original assumption-based Nancy performs poorly.
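The difference between ordering by the lower bound f and ordering by an expected-value estimate f̂ can be shown with a small sketch. It assumes one simple correction, f̂ = g + h + d·ε, where d is an estimated number of remaining steps and ε is an assumed average per-step heuristic error. The node values, the constant ε, and all names are hypothetical; this is neither Nancy nor DDNancy, which maintain explicit belief distributions.

```python
from dataclasses import dataclass

EPSILON = 0.3  # assumed average heuristic error per step (illustrative)


@dataclass
class Node:
    name: str
    g: float  # cost-to-come
    h: float  # lower-bound estimate of cost-to-go
    d: int    # estimated number of steps to the goal


def f(node):
    """A*-style lower bound on total solution cost."""
    return node.g + node.h


def f_hat(node):
    """Expected-cost estimate: inflate h by the assumed error over d steps."""
    return node.g + node.h + node.d * EPSILON


frontier = [
    Node("X", g=2, h=3, d=10),  # low f, but far from the goal
    Node("Y", g=4, h=2, d=1),   # higher f, but almost at the goal
]

best_by_f = min(frontier, key=f).name
best_by_f_hat = min(frontier, key=f_hat).name
print(best_by_f, best_by_f_hat)
```

Under these assumed numbers the two orderings prefer different nodes: the lower bound favors X, while the error-corrected estimate favors Y because X's heuristic has more steps over which to be wrong.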
“…10 randomly sampled start positions were used for both instances we tested on. In the traffic domain, an extension of the domain used by Kiesel, Burns, and Ruml (2015), an agent moves in a grid, avoiding moving obstacles. A dead-end is reached if an obstacle collides with the agent before it reaches a goal state.…”
A fundamental concern in real-time planning is the presence of dead-ends in the state space, from which no goal is reachable. Recently, the SafeRTS algorithm was proposed for searching in such spaces. SafeRTS exploits a user-provided predicate to identify safe states, from which a goal is likely reachable, and attempts to maintain a backup plan for reaching a safe state at all times. In this paper, we study the SafeRTS approach, identify certain properties of its behavior, and design an improved framework for safe real-time search. We prove that the new approach performs at least as well as SafeRTS and present experimental results showing that its promise is fulfilled in practice.