2009
DOI: 10.1007/s10514-009-9130-2
A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot

Abstract: We address the problem of online path planning for optimal sensing with a mobile robot. The objective of the robot is to learn the most about its pose and the environment given time constraints. We use a POMDP with a utility function that depends on the belief state to model the finite horizon planning problem. We replan as the robot progresses throughout the environment. The POMDP is high-dimensional, continuous, non-differentiable, nonlinear, non-Gaussian and must be solved in real-time. Most existing techni…
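The replanning loop the abstract describes — evaluate candidate actions against a belief-dependent utility, execute one, update the belief, and replan — can be sketched as follows. This is a minimal illustration with a hypothetical particle belief, motion model, landmark, and measurement update, not the paper's actual POMDP solver:

```python
import numpy as np

rng = np.random.default_rng(0)
LANDMARK = np.array([1.5, 0.0])   # hypothetical feature the robot can localize against

def step(particles, action):
    """Hypothetical noisy motion model applied to every belief particle."""
    return particles + action + rng.normal(0.0, 0.05, size=particles.shape)

def observe(particles):
    """Hypothetical measurement update: within sensor range of the landmark,
    the observation contracts the belief toward its mean."""
    m = particles.mean(axis=0)
    if np.linalg.norm(m - LANDMARK) < 1.0:
        return m + 0.2 * (particles - m)
    return particles

def utility(particles):
    """Belief-dependent utility: reward low pose uncertainty."""
    return -np.trace(np.cov(particles.T))

def plan_one_step(particles, candidate_actions):
    """Greedy one-step lookahead: simulate each action and pick the one
    whose predicted belief has the highest utility."""
    return max(candidate_actions,
               key=lambda a: utility(observe(step(particles, a))))

# Receding horizon: plan, execute one action, update the belief, replan.
belief = rng.normal(0.0, 1.0, size=(100, 2))      # particle belief over 2-D pose
actions = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 0.0])]
for t in range(3):
    a = plan_one_step(belief, actions)
    belief = observe(step(belief, a))
```

The one-step lookahead stands in for the finite-horizon optimization; replanning after each executed action is what makes the loop online.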


Cited by 187 publications (142 citation statements)
References 40 publications
“…For the same sampling budget, Kriging interpolation is far sharper, with no additional parameter to tune. For each action a_t, the expected return could be estimated by averaging the return of a set of trajectories generated according to the pose-uncertainty distribution, as in [7]. The return of a single trajectory is considered here, since no pose uncertainty is assumed during the learning stage.…”
Section: Results
confidence: 99%
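The averaging strategy this quotation attributes to [7] — estimating an action's expected return by sampling start poses from the pose-uncertainty distribution and averaging per-trajectory returns — can be sketched as follows. The return and uncertainty models here are hypothetical stand-ins for a full trajectory simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

def rollout_return(start_pose, action):
    """Hypothetical return of a single trajectory: negative distance
    to a goal after applying the action (stands in for a simulator)."""
    goal = np.array([2.0, 0.0])
    return -np.linalg.norm(start_pose + action - goal)

def expected_return(action, pose_mean, pose_cov, n_samples=200):
    """Monte Carlo estimate: average the return over start poses drawn
    from the pose-uncertainty distribution."""
    poses = rng.multivariate_normal(pose_mean, pose_cov, size=n_samples)
    return np.mean([rollout_return(p, action) for p in poses])

pose_mean = np.zeros(2)
pose_cov = 0.1 * np.eye(2)
actions = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
best = max(actions, key=lambda a: expected_return(a, pose_mean, pose_cov))
```

Using a single trajectory, as the quoted work does, corresponds to `n_samples=1` with the pose fixed at its mean — cheaper, and unbiased only when pose uncertainty is negligible during learning.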
“…The objective is to find the best approximation of the action-value function within a small sampling budget. In [7], Kriging and Bayesian optimization have been used to address a simultaneous localization and mapping problem under time and energy constraints. In the present paper, similar strategies are investigated to tackle the problem of viewpoint planning for active recognition.…”
Section: Introduction
confidence: 99%
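The idea in this quotation — approximating an action-value function with a Kriging (Gaussian-process) surrogate under a small sampling budget, spending new evaluations where the surrogate is most uncertain — can be sketched in pure NumPy. The squared-exponential kernel, length scale, and toy value function below are illustrative assumptions, not the cited implementation:

```python
import numpy as np

def rbf(a, b, ell=0.5):
    """Squared-exponential (RBF) kernel between two sets of 1-D points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Kriging predictor: posterior mean and variance of a zero-mean GP."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    Kss = rbf(x_query, x_query)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Small sampling budget: a handful of evaluated actions (x) and their returns (y).
q = lambda x: -(x - 0.3) ** 2              # hypothetical true action-value function
x_train = np.array([0.0, 0.5, 1.0])
y_train = q(x_train)

# Budget-limited refinement: greedily evaluate where the surrogate is most uncertain.
xs = np.linspace(0.0, 1.0, 101)
for _ in range(3):
    _, var = gp_posterior(x_train, y_train, xs)
    x_new = xs[np.argmax(var)]
    x_train = np.append(x_train, x_new)
    y_train = np.append(y_train, q(x_new))

mean, _ = gp_posterior(x_train, y_train, xs)
best_action = xs[np.argmax(mean)]
```

A full Bayesian-optimization loop would replace the pure-variance criterion with an acquisition function such as expected improvement, trading exploration of uncertain actions against exploitation of the current best estimate.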
“…In [40], the authors use this kind of technique to optimize motion of a robot in order to minimize uncertainty of localization. They predict the information gain and the cost using the prior and so dynamically solve the exploration/exploitation dilemma in the context of path planning.…”
Section: Active Exploration While Learning
confidence: 99%
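The exploration/exploitation trade described in this quotation — predicting both information gain and cost from the prior, then choosing where to move — can be sketched with a simple scalarized utility over candidate viewpoints. The discrete belief, candidate posteriors, and trade-off weight are hypothetical:

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete belief (in nats)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_info_gain(belief, posterior):
    """Predicted reduction in belief entropy if the candidate is visited."""
    return entropy(belief) - entropy(posterior)

def choose(candidates, belief, trade_off=0.5):
    """Pick the candidate maximizing info gain minus weighted travel cost."""
    return max(candidates,
               key=lambda c: expected_info_gain(belief, c["posterior"])
                             - trade_off * c["cost"])

belief = np.array([0.25, 0.25, 0.25, 0.25])     # maximally uncertain prior
candidates = [
    {"name": "far_informative",   "cost": 2.0,
     "posterior": np.array([0.9, 0.1, 0.0, 0.0])},
    {"name": "near_uninformative", "cost": 0.2,
     "posterior": np.array([0.3, 0.3, 0.2, 0.2])},
]
best = choose(candidates, belief)
```

Raising `trade_off` makes travel cost dominate, flipping the choice toward the cheap but uninformative candidate — which is exactly the dilemma being solved dynamically in the cited work.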
“…Reinforcement Learning (RL) is a notable paradigm for robot control with several reported successes in recent years (Tedrake et al., 2005; Abbeel et al., 2007; Kober and Peters, 2009; Riedmiller et al., 2009; Martinez-Cantin et al., 2009). In this paper we address the problem of model-free RL, in which the goal is to learn a controller without knowledge of the dynamics of the system (Bertsekas and Tsitsiklis, 1996; Sutton and Barto, 1998).…”
Section: Introduction
confidence: 99%