2006
DOI: 10.1016/j.crma.2006.07.011

A policy iteration algorithm for zero-sum stochastic games with mean payoff

citations
Cited by 12 publications
(23 citation statements)
references
References 15 publications
“…This difficulty was solved first in the deterministic framework in [CTGG99], where it was shown that cycling can be avoided by enforcing a special choice of the bias vector, obtained by a nonlinear projection operation. This approach was then extended to the stochastic framework in [CTG06,ACTDG12]. As a special case of these results, we get that policy iteration is correct and does terminate under much milder conditions than in Theorem 4.1.…”
Section: Theorem 4.1 (Corollary of [HK66]) Algorithm 1 Terminates An
Citation type: mentioning
confidence: 83%
“…Then, using the uniqueness of bias vectors, we deduce from Lemma 3.3 of [17], that if λ k+1 = λ k , then up to an additive constant, v k+1 ≤ v k with equality on the set of critical nodes of T σ k r . This implies that the sequence v k coincides up to an additive constant with the sequence obtained in the algorithm introduced in [19] and developed in [20]. Then, the convergence of Algorithm 21 in a finite number of steps follows from the convergence of the algorithm of [19], [20].…”
Section: Sketch Of Proof
Citation type: mentioning
confidence: 99%
“…[Equation (11): matrix display not recoverable from the extracted text] where the eigenvalue λ of the min-plus nonlinear system (10) has been given in Theorem 9…”
Section: Appendix
Citation type: mentioning
confidence: 99%