Philippe Morere scite author profile

Marchant

2017

Bayesian Optimisation (BO) has recently gained popularity as a global optimisation technique for functions that are expensive to evaluate or unknown a priori. While classical BO focuses on where to gather the next observation, it does not take into account the practical constraints of a robotic system, such as where it is physically possible to gather samples, nor the sequential nature of the problem while executing a trajectory. In field robotics and other real-life situations, physical and trajectory constraints are inherent. This paper addresses these issues by formulating Bayesian Optimisation for continuous trajectories within a Partially Observable Markov Decision Process (POMDP) framework. The resulting POMDP is solved using Monte-Carlo Tree Search (MCTS), which we adapt to use a reward function balancing exploration and exploitation. Experiments on monitoring a spatial phenomenon with a UAV illustrate how our BO-POMDP algorithm outperforms competing techniques.

Bayesian Local Sampling-Based Planning

Lai et al.

IEEE Robot. Autom. Lett.

2020

Continuous State-Action-Observation POMDPs for Trajectory Planning with Bayesian Optimisation

Marchant

2018

Learning to Plan Hierarchically From Curriculum

Ott

IEEE Robot. Autom. Lett.

2019

We present a framework for learning to plan hierarchically in domains with unknown dynamics. We enhance planning performance by exploiting problem structure in several ways: (i) we simplify the search over plans by leveraging knowledge of skill objectives, (ii) we generate shorter plans by enforcing aggressively hierarchical planning, and (iii) we learn transition dynamics with sparse local models for better generalisation. Our framework decomposes transition dynamics into skill effects and success conditions, which allows fast planning by reasoning on effects while learning conditions from interactions with the world. We propose a simple method for learning new abstract skills, using successful trajectories stemming from completing the goals of a curriculum. Learned skills are then refined to leverage other abstract skills and enhance subsequent planning. We show that both conditions and abstract skills can be learned simultaneously while planning, even in stochastic domains. Our method is validated in experiments of increasing complexity, with up to 2^100 states, showing superior planning to classic non-hierarchical planners or reinforcement learning methods. Applicability to real-world problems is demonstrated in a simulation-to-real transfer experiment on a robotic manipulator.

Learning from Demonstration without Demonstrations

Blau

Francis

2021