2005
DOI: 10.1007/11564096_35

Model-Based Online Learning of POMDPs

Abstract: Learning to act in an unknown partially observable domain is a difficult variant of the reinforcement learning paradigm. Research in the area has focused on model-free methods, which learn a policy without learning a model of the world. When sensor noise increases, model-free methods provide less accurate policies. The model-based approach, learning a POMDP model of the world and computing an optimal policy for the learned model, may generate superior results in the presence of sensor noise, …

Cited by 72 publications (68 citation statements)
References 5 publications
“…On the other hand, online approaches reduce the complexity of the problem by planning online for only the current information state [17,18,19]. It considers only a small horizon of possible scenarios.…”
Section: POMDPs
confidence: 99%
“…With the above two ideas, U-Tree can be further improved by making use of POMDP belief state based value in place of the MDP Q-value iteration. A similar approach was taken by Guy Shani et al [31], in which they proposed an extension of McCallum's Utile Suffix Memory [32] that makes use of the sensor reliability statistics and a modified version of Perseus [12] point-based belief state value iteration. However, their statistical approach of obtaining the state observation probabilities does not seem to be justified.…”
confidence: 92%
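The statement above contrasts MDP Q-value iteration with POMDP belief-state methods such as Perseus. The common machinery underlying all belief-state approaches is the Bayes-filter belief update; the sketch below shows that update on a hypothetical two-state domain with a noisy sensor (the transition and observation arrays are illustrative placeholders, not taken from the cited papers).

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Posterior belief after taking action a and observing o:
    b'(s') proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    predicted = b @ T[a]              # prediction step: sum_s T(s'|s,a) b(s)
    unnormalized = predicted * O[a][:, o]  # correction step: weight by P(o|s',a)
    return unnormalized / unnormalized.sum()

# Hypothetical two-state toy domain, one action, sensor with 0.85 accuracy.
# T[a][s, s'] = P(s' | s, a); O[a][s', o] = P(o | s', a).
T = {0: np.array([[0.9, 0.1],
                  [0.2, 0.8]])}
O = {0: np.array([[0.85, 0.15],
                  [0.15, 0.85]])}

b = np.array([0.5, 0.5])              # uniform prior over the two states
b_next = belief_update(b, a=0, o=0, T=T, O=O)
```

Starting from a uniform belief, observing o=0 shifts most of the probability mass onto state 0; point-based methods like Perseus back up value estimates over a sampled set of such belief points rather than the full belief simplex.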
“…The model-based approach is an important branch of POMDP research (Sallans, 2000; Theocharous, 2002; Shani et al., 2005). In this approach, the environment is assumed to be unknown to the agent, and the agent learns the model of the environment through experience (i.e., the history of actions and observations).…”
Section: Introduction
confidence: 99%