Abstract-This paper addresses the problem of planning a safe (i.e., collision-free) trajectory from an initial state to a goal region when the obstacle space is a priori unknown and is incrementally revealed online, e.g., through line-of-sight perception. Despite its ubiquitous nature, this formulation of motion planning has received relatively little theoretical investigation, as opposed to the setup where the environment is assumed known. A fundamental challenge is that, unlike motion planning with known obstacles, it is not even clear what an optimal policy to strive for is. Our contribution is threefold. First, we present a notion of optimality for safe planning in unknown environments in the spirit of comparative (as opposed to competitive) analysis, with the goal of obtaining a benchmark that is, at least conceptually, attainable. Second, by leveraging this theoretical benchmark, we derive a pseudo-optimal class of policies that can seamlessly incorporate any amount of prior or learned information while still guaranteeing that the robot never collides. Finally, we demonstrate the practicality of our algorithmic approach in numerical experiments using a range of environment types and dynamics, including a comparison with a state-of-the-art method. A key aspect of our framework is that it automatically and implicitly weighs exploration versus exploitation in a way that is optimal with respect to the information available.
I. INTRODUCTION
This paper addresses the problem of planning a safe (i.e., collision-free) trajectory from an initial state to a goal region when the obstacles in between are a priori unknown and are instead revealed online, e.g., through line-of-sight perception. A fundamental challenge is that, unlike motion planning with known obstacles or obstacles whose uncertainty is fully modeled probabilistically, when parts of the configuration space are simply unknown it is not even clear what an optimal policy to strive for (through, e.g., asymptotic convergence) is. One of the main goals of this paper is to address this shortcoming in the literature by defining a notion of optimality for use as a benchmark and also for conceptual guidance. We then follow this conceptual guidance to propose a novel algorithm for planning in unknown environments that produces low-cost solutions by flexibly incorporating side information about the environment, is guaranteed to be collision-free (even if the side information is incorrect), and requires on the order of 0.5-1 s of (serial) computation time per action.
Related Work: Most algorithms for motion planning assume full knowledge of the environment, and cannot be used, at least directly, to find a motion plan within an incomplete map [17,18]. A common heuristic approach is to set temporary goals along the way to the goal region, and re-compute in