2000
DOI: 10.1080/02331930008844505

Constrained Markov decision processes with compact state and action spaces: the average case

Abstract: Constrained Markov decision processes with compact state and action spaces are studied under long-run average reward or cost criteria. A saddle-point theorem is given for a corresponding Lagrange function, and it is used to show the existence of a constrained optimal pair of initial state distribution and policy. In addition, under Doeblin's hypothesis, a functional characterization of a constrained optimal policy is obtained.
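
For orientation, a constrained average-reward problem and its Lagrange function are commonly written in the generic form sketched below. The symbols ($\psi_0$, $\psi_i$, $\alpha_i$, $\lambda$, $k$) are illustrative assumptions, not notation taken from the paper, whose exact formulation may differ.

```latex
% Generic sketch only; all notation here is assumed, not the paper's.
% (\nu,\pi): a pair of initial state distribution and policy.
% \psi_0: long-run average reward; \psi_i: long-run average costs; \alpha_i: constraint bounds.
\begin{align*}
&\text{maximize } \psi_0(\nu,\pi)
  \quad \text{subject to } \psi_i(\nu,\pi) \le \alpha_i, \quad i = 1,\dots,k, \\
&L\bigl((\nu,\pi),\lambda\bigr)
  = \psi_0(\nu,\pi) - \sum_{i=1}^{k} \lambda_i \bigl(\psi_i(\nu,\pi) - \alpha_i\bigr),
  \qquad \lambda \ge 0, \\
&L\bigl((\nu,\pi),\lambda^*\bigr)
  \le L\bigl((\nu^*,\pi^*),\lambda^*\bigr)
  \le L\bigl((\nu^*,\pi^*),\lambda\bigr)
  \quad \text{for all } (\nu,\pi) \text{ and all } \lambda \ge 0.
\end{align*}
```

A pair $((\nu^*,\pi^*),\lambda^*)$ satisfying the last line is a saddle point of $L$; in this generic setting, the existence of a saddle point implies that $(\nu^*,\pi^*)$ is feasible and constrained optimal.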

Cited by 15 publications (24 citation statements)
References 11 publications
“…Further, such nonnegativeness assumption cannot be removed because it is required for the use of the standard weak convergence of probability measures. This in turn implies that the constrained optimality problem of minimizing nonnegative costs in [11,19,20] with constraints imposed on other nonnegative costs cannot be transformed to an equivalent optimality problem of maximizing bounded rewards as in [29] with constraints imposed on bounded costs. Hence, the constrained discrete and continuous time MDPs with Polish spaces, in which rewards (to be maximized) and costs (with constraints) may be unbounded from above and from below, have not been studied. On the other hand, as is known, continuous-time MDPs in Polish spaces have been studied in [11,12,16,27,34].…”
mentioning
confidence: 99%
“…Additional results and generalizations of some of the results of [20] were given by Hernández-Lerma and González-Hernández [17]. Extensions of the LP framework to constrained MDPs were subsequently studied by Kurano et al [31] for compact spaces and by Hernández-Lerma et al [19] for non-compact spaces.…”
Section: Introduction
mentioning
confidence: 94%
“…Our results for unconstrained (resp., constrained) MDPs given in this paper can be compared with some of the prior results in [22,Chap. 12] and [23] (resp., [19] and [31]) for lower-semicontinuous models.…”
Section: Introduction
mentioning
confidence: 99%
“…Recently, Kurano et al [3] derived a saddle-point theorem for constrained MDPs with average reward criteria. For the utility treatment for MDPs and constrained MDPs, refer to [1,2,4-7] and their references.…”
Section: Introduction and Problem Formulation
mentioning
confidence: 99%