1973
DOI: 10.1287/opre.21.5.1071

The Optimal Control of Partially Observable Markov Processes over a Finite Horizon

Abstract: This paper formulates the optimal control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process. The states of this internal process are not directly observable by the controller; rather, he has available a set of observable outputs that are only probabilistically related to the internal state of the system. The formulation is illustrated by a simple machine-maintenance example, and other specific application areas are al…

Cited by 1,091 publications (732 citation statements)
References 4 publications
“…24 and demonstrate the empirical performance of the model clustering approach on two problem domains: the multiagent tiger problem (tiger's location resets if a door is opened) and a multiagent version of the machine maintenance problem [34], both of which are described in the Appendix. In particular, we show that the quality of the policies generated using our method approaches that of the exact policy as K increases.…”
Section: Results (mentioning)
confidence: 99%
“…They generalize POMDPs [19,34] to multiagent settings by including other agents' computable models in the state space along with the states of the physical environment. The models encompass all information influencing the agents' behaviors, including their preferences, capabilities, and beliefs, and are thus analogous to types in Bayesian games as first envisioned by Harsanyi [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…The Markov property does not hold in these circumstances. In such problems, it can be shown (Lovejoy, 1991; Sondik, 1978; Smallwood and Sondik, 1973; Monahan, 1982; White III and Scherer, 1989; Lin et al, 2004) that it is sufficient to keep a probability distribution called the belief state distribution of the current state. The most challenging aspect of this problem is that the state space becomes continuous, and consequently it is difficult to solve the problem exactly even for a handful of states.…”
Section: Semi-Markov Decision Problems (mentioning)
confidence: 99%
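The belief-state idea in the passage above can be sketched as a simple Bayes filter. This is a minimal illustration, not code from the paper: the two-state "machine-maintenance" transition and observation probabilities below are hypothetical numbers chosen only to show the update b'(s') ∝ O(o | s') Σ_s T(s' | s) b(s).

```python
import numpy as np

# Hypothetical 2-state machine-maintenance model (illustrative numbers only).
# States: 0 = working, 1 = broken.
T = np.array([[0.9, 0.1],    # P(s' | s=working) under the "operate" action
              [0.0, 1.0]])   # a broken machine stays broken
# Observations: 0 = good part produced, 1 = defective part produced.
O = np.array([[0.95, 0.05],  # P(o | s'=working)
              [0.25, 0.75]]) # P(o | s'=broken)

def belief_update(b, obs):
    """One step of the belief-state recursion: predict, weight, normalize."""
    b_pred = b @ T               # push the belief through the transition model
    b_new = b_pred * O[:, obs]   # weight each state by the observation likelihood
    return b_new / b_new.sum()   # renormalize to a probability distribution

b = np.array([1.0, 0.0])         # start certain the machine is working
b = belief_update(b, obs=1)      # then observe a defective part
print(b)                         # posterior puts most mass on "broken": [0.375, 0.625]
```

The belief vector is a sufficient statistic for the observation history, which is exactly why the controller can plan over beliefs instead of histories; the cost, as the quoted passage notes, is that the effective state space becomes the continuous probability simplex.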
“…This has been done in [9], where efficient algorithms are also proposed for solving the dynamic programming equations.…”
Section: Theorem 1 [1] (mentioning)
confidence: 99%