2011
DOI: 10.1287/moor.1110.0516

The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

Abstract: We prove that the classic policy-iteration method (Howard 1960), including the Simplex method (Dantzig 1947) with the most-negative-reduced-cost pivoting rule, is a strongly polynomial-time algorithm for solving the Markov decision problem (MDP) with a fixed discount rate. Furthermore, the computational complexity of the policy-iteration method (including the Simplex method) is superior to that of the only known strongly polynomial-time interior-point algorithm ([28] 2005) for solving this problem. The result…
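The two methods named in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's own presentation: it assumes a cost-minimizing discounted MDP stored as nested lists, and every name in it (`solve_linear`, `evaluate`, `policy_iteration`, `single_switch`, the example data) is hypothetical. With `single_switch=False` every state with an improving action is switched (Howard-style policy iteration); with `single_switch=True` only the state-action pair with the most negative reduced cost is switched, which mirrors a simplex pivot under the most-negative-reduced-cost rule.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= f * M[col][k]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(P, c, gamma, policy):
    """Value of a fixed policy: solve (I - gamma * P_pi) v = c_pi."""
    n = len(policy)
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][policy[i]][j]
          for j in range(n)] for i in range(n)]
    b = [c[i][policy[i]] for i in range(n)]
    return solve_linear(A, b)


def policy_iteration(P, c, gamma, single_switch=False, tol=1e-12):
    """P[i][a][j]: transition probability, c[i][a]: immediate cost (assumed layout)."""
    n = len(P)
    policy = [0] * n
    iterations = 0
    while True:
        v = evaluate(P, c, gamma, policy)
        improving = []
        for i in range(n):
            # Reduced cost of action a in state i under the current values v.
            rc = [c[i][a] + gamma * sum(P[i][a][j] * v[j] for j in range(n)) - v[i]
                  for a in range(len(P[i]))]
            a_min = min(range(len(rc)), key=rc.__getitem__)
            if rc[a_min] < -tol:
                improving.append((rc[a_min], i, a_min))
        if not improving:
            return policy, v, iterations  # no negative reduced cost: optimal
        if single_switch:
            improving = [min(improving)]  # most negative reduced cost only
        for _, i, a in improving:
            policy[i] = a
        iterations += 1


# Toy example (made-up numbers): 2 states, 2 actions each, discount 0.5.
P_example = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: stay / move to state 1
             [[0.0, 1.0], [1.0, 0.0]]]   # state 1: stay / move to state 0
c_example = [[1.0, 1.5], [0.0, 0.5]]
print(policy_iteration(P_example, c_example, 0.5))
print(policy_iteration(P_example, c_example, 0.5, single_switch=True))
```

On this toy instance both variants reach the same policy; they differ only in how many actions are switched per iteration, which is the quantity the paper's iteration bounds concern.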

Citations: cited by 144 publications (176 citation statements)
References: 25 publications (30 reference statements)
“…On another front, the interesting polynomial simplex-like algorithm of Kelner and Spielman [20] does not settle Conjecture 2 because it is weakly polynomial, as the complexity of each iteration, and the number of iterations, depends (polynomially of course) on the bits of the integers in the input. Other recent results related to Conjecture 1 can be found in [6,35].…”
Section: Conjecture 1 There Is a Strongly Polynomial Algorithm For L (mentioning)
confidence: 83%
“…By Lemma 4.12, we know that δ(s, a, N(u_k)) → 0 as k → ∞. We will also show that γ_{u_k, N(u_k)}(s, a) becomes nonpositive as k → ∞, which will contradict (25), and thus, we will conclude that ȳ is feasible to (P).…”
Section: Proof (mentioning)
confidence: 67%
“…Recently, complexity of the simplex method with the Dantzig's pivoting rule for finite-state MDPs was studied in [25]. If one can derive a number of iterations (or computational complexity) for the simplex algorithm for countable-state MDPs to find a policy whose value function is within a given threshold from the optimal value function, then it would be possible to compare the convergence rates of the algorithms for countable-state MDPs by comparing the result for the simplex algorithm to the ones in [23,21].…”
Section: Discussion and Future Research (mentioning)
confidence: 99%
“…It should be remarked that the diameter of the resulting polytopes is actually smaller than the Hirsch bound. Curiously, Ye (2011) showed that the simplex method using Dantzig's pivot rule (where one chooses the entering variable with the largest reduced cost coefficient) is strongly polynomial for the linear programs derived from Markov Decision Processes with Fixed Discount (which is not the setting for the other papers, but is an important case of MDPs).…”
Section: Comment 2: We Still Need To Work Harder To Understand the Ge (mentioning)
confidence: 99%