Robotics: Science and Systems III 2007
DOI: 10.15607/rss.2007.iii.041
Active Policy Learning for Robot Planning and Exploration under Uncertainty

Abstract: This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially observed sequential decision processes. The algorithm is tested in the domain of robot navigation and exploration under uncertainty, where the expected cost is a function of the belief state (filtering distribution). This filtering distribution is in turn nonlinear and subject to discontinuities, which arise because of constraints in the robot motion and control models. As a result, the expected cost is …
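The abstract's core setup, an expected cost that depends on the belief (filtering) state and is estimated by simulation, can be sketched as below. Everything concrete here (the 1-D toy motion and observation models, the linear feedback policy, the variance-of-belief cost) is an illustrative assumption, not the paper's actual models.

```python
import numpy as np

def rollout_cost(policy_params, horizon=10, n_particles=100, seed=0):
    """One simulated rollout; returns the average belief uncertainty (toy cost)."""
    rng = np.random.default_rng(seed)
    x = 0.0                                            # true (hidden) state, 1-D toy
    particles = rng.normal(0.0, 1.0, n_particles)      # particle belief over x
    total = 0.0
    for _ in range(horizon):
        u = policy_params[0] * np.mean(particles)      # assumed linear feedback policy
        x = x + u + rng.normal(0.0, 0.1)               # toy motion model with noise
        particles = particles + u + rng.normal(0.0, 0.1, n_particles)
        z = x + rng.normal(0.0, 0.5)                   # noisy observation
        w = np.exp(-0.5 * ((z - particles) / 0.5) ** 2) + 1e-12
        w /= w.sum()
        particles = rng.choice(particles, n_particles, p=w)  # resample (filtering step)
        total += np.var(particles)                     # cost = belief uncertainty
    return total / horizon

def expected_cost(policy_params, n_rollouts=20):
    """Monte Carlo estimate of the expected cost of a policy, as in the abstract."""
    return float(np.mean([rollout_cost(policy_params, seed=s)
                          for s in range(n_rollouts)]))
```

The resampling step is what makes the cost a non-smooth function of the policy parameters, which is the difficulty the abstract alludes to.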

Cited by 113 publications (107 citation statements) · References 28 publications (32 reference statements)
“…This idea has been pursued in some works [67, 94-96, 106]. Huang et al. [67] introduced a discussion of the problem of multi-step look-ahead exploration in the context of SLAM, arguing that multi-step active SLAM is possible when the current estimation error is small, the probability of observing new features is low, and the computational capability is high.…”
Section: Action Selection (mentioning)
confidence: 99%
“…In the work presented in [105, 106], Martinez-Cantin et al. proposed a reinforcement learning approach to solve the problem of exploration for SLAM; their technique is based on the work presented in [114]. They employ a direct policy search approach [122], where the value function is approximated using Gaussian processes (GP).…”
Section: Action Selection (mentioning)
confidence: 99%
“…Active learning differs, however, in that the aim is only to poll the user when the information returned is useful (above a threshold or according to some constrained budget). To the best of our knowledge it is novel in the area of thermal comfort modelling, but it has a long history and has been applied in many fields such as robot control [24], fault detection [25], and as a general optimisation approach [26], amongst others [27][28][29].…”
Section: Related Work (mentioning)
confidence: 99%
“…Note that it is through ζ x that C1 and C2 may be considered in the model. The maximum a-posteriori estimates for the process parameters are [24]:…”
Section: Gaussian Process Models (mentioning)
confidence: 99%
“…In our case, the expected number of landmarks to see and a very rough uniform disposition of them in the environment are our initial conditions. Several authors make such an assumption, either with an a priori grid-based discretization of the environment [14], [20] or by adding uniformly distributed unvisited landmarks as vague priors [21], [22].…”
Section: Action Selection (mentioning)
confidence: 99%