2009 IEEE/RSJ International Conference on Intelligent Robots and Systems
DOI: 10.1109/iros.2009.5354013
Bayesian reinforcement learning in continuous POMDPs with Gaussian processes

Abstract: Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model for real-world sequential decision processes, but most solution approaches require a known model. Moreover, mainstream POMDP research focuses on the discrete case, which complicates application to realistic problems that are naturally modeled with continuous state spaces. In this paper, we consider the problem of optimal control in continuous and partially observable environments when the parameters …
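The abstract is truncated here, but the paper's core idea is to learn the unknown dynamics of a continuous POMDP with Gaussian process regression. As a minimal illustration of that underlying technique only (not the authors' algorithm), the sketch below fits a GP to sampled state transitions; all function names, hyperparameters, and the toy dynamics are our own assumptions:

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    # Squared-exponential covariance between two 1-D state arrays.
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(s_train, s_next_train, s_test, noise=1e-2):
    # Standard GP regression posterior over the transition function:
    # mean and marginal variance of s' at the query states s_test.
    K = rbf_kernel(s_train, s_train) + noise * np.eye(len(s_train))
    K_s = rbf_kernel(s_train, s_test)
    K_ss = rbf_kernel(s_test, s_test)
    alpha = np.linalg.solve(K, s_next_train)
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, np.diag(cov)

# Toy transition data: next state = 0.9 * s plus small Gaussian noise.
rng = np.random.default_rng(0)
s = rng.uniform(-2, 2, 20)
s_next = 0.9 * s + 0.05 * rng.standard_normal(20)

mean, var = gp_predict(s, s_next, np.array([0.0, 1.0]))
```

The posterior variance is what makes the model useful for Bayesian RL: regions with little transition data get wide predictive uncertainty, which a planner can account for.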

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
25
0

Year Published

2013
2013
2018
2018

Publication Types

Select...
4
3
1

Relationship

0
8

Authors

Journals

citations
Cited by 26 publications (26 citation statements); references 13 publications.
“…Planning under uncertainty and mapping are combined in approaches that attempt to simultaneously capture the environment model and optimize the policy [33,139]. Up to now, those approaches are only valid for relatively small problems.…”
Section: Bibliographical Notes (mentioning)
Confidence: 99%
“…Another approach proposed by Dallaire et al [2009] allows flexibility over the choice of transition function. Here the transition and reward functions are defined by:…”
Section: Extensions to Continuous MDPs (mentioning)
Confidence: 99%
“…In contrast to previous work on continuous POMDPs (e.g., [41]), we are focused on large structured action spaces (i.e., all possible routings of the cooperative vehicles).…”
Section: Related Work (mentioning)
Confidence: 99%