We propose automatically learning probabilistic Hierarchical Task Networks (pHTNs) in order to capture a user's preferences on plans by observing only the user's behavior. HTNs are a common choice of representation for a variety of purposes in planning, including work on learning in planning. Our contributions are (a) learning structure and (b) representing preferences. In contrast, prior work employing HTNs considers learning method preconditions (instead of structure) and representing domain physics or search control knowledge (rather than preferences). Initially we assume that the observed distribution of plans is an accurate representation of user preference, and then generalize to the situation where feasibility constraints frequently prevent the execution of preferred plans. In order to learn a distribution on plans, we adapt an Expectation-Maximization (EM) technique from the discipline of (probabilistic) grammar induction, taking the perspective of task reductions as productions in a context-free grammar over primitive actions. To account for the difference between the distributions of possible and preferred plans, we subsequently modify this core EM technique, in short, by rescaling its input.
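The grammar view of task reductions can be illustrated with a small sketch. It is not the authors' algorithm: it assumes the candidate derivations of each observed plan are enumerated by hand (a full treatment would use something like inside-outside), and the names `em_production_probs`, `observations`, and the per-plan `weight` are hypothetical. The weight stands in for the idea of rescaling the input, so that a plan's observed frequency can be adjusted before it contributes to the expected counts.

```python
from collections import defaultdict
from math import prod


def em_production_probs(observations, iterations=50):
    """Estimate production probabilities with a simple EM loop.

    observations: list of (weight, derivations) pairs, one per observed plan.
    Each derivation is a list of (head, body) productions, with body a tuple.
    The weight is a hypothetical rescaling factor for that plan.
    """
    # Collect the alternative bodies seen for each head (task).
    bodies = defaultdict(set)
    for _weight, derivations in observations:
        for derivation in derivations:
            for head, body in derivation:
                bodies[head].add(body)

    # Start from a uniform distribution over each head's alternatives.
    probs = {(h, b): 1.0 / len(bs) for h, bs in bodies.items() for b in bs}

    for _ in range(iterations):
        # E-step: expected (weighted) usage count of every production.
        counts = defaultdict(float)
        for weight, derivations in observations:
            scores = [prod(probs[r] for r in d) for d in derivations]
            total = sum(scores)
            if total == 0:
                continue
            for derivation, score in zip(derivations, scores):
                for rule in derivation:
                    counts[rule] += weight * score / total

        # M-step: renormalise the expected counts per head.
        head_totals = defaultdict(float)
        for (head, _body), count in counts.items():
            head_totals[head] += count
        probs = {
            rule: counts[rule] / head_totals[rule[0]]
            if head_totals[rule[0]] > 0 else probs[rule]
            for rule in probs
        }
    return probs


# Two unambiguous plans: "train" observed with weight 3, "bus" with weight 1.
example = [
    (3.0, [[("Travel", ("train",))]]),
    (1.0, [[("Travel", ("bus",))]]),
]
learned = em_production_probs(example)
```

With unambiguous derivations the loop reduces to weighted relative frequencies; the E-step only matters when a plan admits several derivations, which is the case the grammar-induction framing is meant to handle.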
Planning research has returned to the issue of optimizing costs (rather than sizes) of plans. A prevalent perception, at least among non-experts in search, is that graph search for optimizing the size of paths generalizes more or less trivially to optimizing the cost of paths. While this kind of generalization is usually straightforward for graph theorems, graph algorithms are a different story. In particular, implementing a search evaluation function by substituting cost for size is a Bad Idea. Though experts have stated as much, cutting-edge practitioners are still learning the consequences the hard way; here we mount a forceful indictment of the inherent dangers of cost-based search.
Planning graphs have been shown to be a rich source of heuristic information for many kinds of planners. In many cases, planners must compute a planning graph for each element of a set of states, and the naive technique enumerates the graphs individually. This is equivalent to solving an all-pairs shortest path problem by iterating a single-source algorithm over each source. We introduce a structure, the state agnostic planning graph, that directly solves the all-pairs problem for the relaxation introduced by planning graphs. The technique can also be characterized as exploiting the overlap present in sets of planning graphs. For the purpose of exposition, we first present the technique in deterministic planning. A more prominent application of this technique is in belief state space planning, where an optimization to exploit state overlap between belief states results in drastically improved theoretical complexity. We describe another extension in probabilistic planning that uses common action outcome uncertainty to further improve theoretical complexity. Our experimental evaluation (using many existing International Planning Competition problems) quantifies each of these performance boosts, and demonstrates that heuristic belief state space progression planning using our technique is competitive with the state of the art.
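The idea of sharing work across a set of source states can be sketched as follows. This is a minimal illustration under stated assumptions, not the structure described in the abstract: it handles only delete-relaxed reachability, represents each label as an explicit set of source-state indices rather than any compact encoding, and the names `shared_relaxed_levels`, `sources`, and `actions` are hypothetical.

```python
def shared_relaxed_levels(sources, actions):
    """Relaxed reachability for many source states in a single pass.

    sources: list of frozensets of propositions (one per source state).
    actions: list of (name, preconditions, add_effects) with frozenset fields.
    Returns {proposition: {source_index: first level at which it is reached}}.
    """
    label = {}  # proposition -> set of source indices that reach it so far
    first = {}  # proposition -> {source index: first level reached}
    for i, state in enumerate(sources):
        for p in state:
            label.setdefault(p, set()).add(i)
            first.setdefault(p, {})[i] = 0

    level = 0
    changed = True
    while changed:
        changed = False
        level += 1
        new_label = {p: set(s) for p, s in label.items()}
        for _name, pre, add in actions:
            # An action is applicable from exactly those sources that
            # reached all of its preconditions at the previous level.
            if pre:
                applicable = set.intersection(
                    *(label.get(p, set()) for p in pre)
                )
            else:
                applicable = set(range(len(sources)))
            for p in add:
                gained = applicable - new_label.get(p, set())
                if gained:
                    new_label.setdefault(p, set()).update(gained)
                    for i in gained:
                        first.setdefault(p, {})[i] = level
                    changed = True
        label = new_label
    return first


sources = [frozenset({"a"}), frozenset({"b"})]
actions = [
    ("a_to_b", frozenset({"a"}), frozenset({"b"})),
    ("b_to_c", frozenset({"b"}), frozenset({"c"})),
]
levels = shared_relaxed_levels(sources, actions)
```

Each action is examined once per level regardless of how many source states there are; the per-source information lives in the labels, which is where the overlap between the individual graphs is exploited.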