Wiley Encyclopedia of Operations Research and Management Science 2011
DOI: 10.1002/9780470400531.eorms0646
Partially Observable MDPs (POMDPs): Introduction and Examples

Abstract: A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process where the states of the model are not completely observable by the decision maker. Noisy observations provide a belief regarding the underlying state, while the decision maker has some control over the progression of the model through the selection of actions. In this article, we introduce POMDPs and discuss the relationship between Markov models and POMDPs. A general POMDP formulation and a wide range of PO…
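The belief mechanism described in the abstract can be illustrated with a minimal discrete-state sketch: given a prior belief over hidden states, an action, and a noisy observation, Bayes' rule yields the posterior belief. The two-state transition and observation numbers below are hypothetical, not taken from the article.

```python
import numpy as np

def update_belief(belief, action, observation, T, O):
    """Bayesian belief update for a discrete POMDP.

    belief: (S,) prior probability over hidden states
    T: (A, S, S) transition probabilities T[a, s, s']
    O: (A, S, Z) observation probabilities O[a, s', z]
    """
    predicted = belief @ T[action]               # predict the next-state distribution
    unnormalized = predicted * O[action][:, observation]  # weight by observation likelihood
    return unnormalized / unnormalized.sum()     # normalize to a probability vector

# Tiny two-state, one-action, two-observation example (illustrative numbers)
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
O = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
b0 = np.array([0.5, 0.5])                        # uniform prior belief
b1 = update_belief(b0, action=0, observation=0, T=T, O=O)
print(b1)  # posterior belief after observing signal 0
```

Because the observation likelihood for state 0 is higher, the posterior shifts mass toward state 0; iterating this update over a trajectory is how a POMDP agent tracks its belief state.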

Cited by 5 publications (5 citation statements); references 46 publications.
“…The overall goal of reinforcement learning is to achieve the maximum reward over time by learning a policy through which the agent can choose proper actions for given states. 1 Typical reinforcement learning algorithms include Markov decision processes and partially observed Markov decision processes. 47,48 …”

Section: Machine Learning

confidence: 99%
“…If Monte Carlo simulations are carried out enough times, then the distribution of the cumulative number of cycles N will be obtained. This distribution is finally parameterized as the GMD pattern, defined in Equation (31). In this case, the cumulative number of cycles in a year can be represented using the Gaussian distribution N ∼ N(75,823, 1149²).…”

Section: BDLM Establishing

confidence: 99%
“…The observation parameters in continuous POMDPs evolve from a probability matrix into probability density functions. After the BDLMs are established, a stochastic observation value o_{t+1} is obtained based on the GMM in Equation (31). The observed belief state is updated by Equation (32), where the parameter P_DC in Equation (32) is given via Equation (5) and the STRs in Table 2.…”

Section: Management Model Establishing

confidence: 99%
“…General POMDP models have found wide applicability in multistate maintenance optimization problems (cf. and references therein). Maintenance models pertaining to wind turbines in particular are fairly sparse, with the exception of Byon et al., who formulated a POMDP model to optimally maintain a wind turbine component whose degradation state evolves as a finite, discrete‐time Markov chain (DTMC).…”

Section: Introduction

confidence: 99%