2018
DOI: 10.1201/9781482273274
Self-Learning Control of Finite Markov Chains

Cited by 64 publications (16 citation statements). References 0 publications.
“…Each iteration of formulas presented in Eq. (17) and Eq. (18) has a natural interpretation and involves three nonlinear equations, corresponding to evaluation of the three extraproximal operators.…”
Section: 3. Multi-Period Portfolio Optimization Methods
mentioning, confidence: 99%
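The extraproximal operators referenced in this statement are specific to the citing paper and are not reproduced here. As a loose illustration of the predictor-corrector idea behind such schemes, here is a minimal extragradient sketch for a bilinear saddle-point problem; the coupling matrix, step size, and iteration count are illustrative assumptions:

```python
import numpy as np

# Extragradient (predictor-corrector) sketch for the bilinear saddle-point
# problem min_x max_y x^T A y. A, the step size, and the iteration count
# are illustrative assumptions, not values from the cited work.
A = np.eye(3)
x = np.ones(3) / 3
y = np.ones(3) / 3
step = 0.1

for _ in range(2000):
    # predictor step: gradients evaluated at the current point
    x_pred = x - step * (A @ y)
    y_pred = y + step * (A.T @ x)
    # corrector step: gradients evaluated at the predicted point
    x = x - step * (A @ y_pred)
    y = y + step * (A.T @ x_pred)

# the iterates approach the saddle point at the origin
print(np.linalg.norm(x), np.linalg.norm(y))
```

The predictor step plays the role of the first operator evaluation per iteration; plain gradient descent-ascent on the same bilinear objective would diverge, which is why the extra evaluation is needed.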
“…Let S be a finite set consisting of states {s_1, …, s_N}, N ∈ ℕ, called the state space. A stationary Markov chain [17, 7] is a sequence of S-valued random variables X(n), n ∈ ℕ, satisfying the Markov condition:…”
Section: Homogeneous Markov Chains Model
mentioning, confidence: 99%
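The quoted definition can be made concrete with a short simulation. A minimal sketch, in which the three-state transition matrix and seed are illustrative assumptions, not values from the cited work:

```python
import numpy as np

# Stationary (homogeneous) Markov chain on a finite state space
# S = {0, ..., N-1}: the transition probabilities do not change over time,
# and X(n+1) depends only on X(n) (the Markov condition).
rng = np.random.default_rng(42)
P = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.3, 0.7],
])
assert np.allclose(P.sum(axis=1), 1.0)  # each row is a distribution

def simulate(P, x0, steps, rng):
    """Draw X(0), ..., X(steps) from the chain started at x0."""
    path = [x0]
    for _ in range(steps):
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path

path = simulate(P, 0, 10, rng)
print(path)
```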
“…The assumption that the Markov chains are ergodic ensures that (π_ij) has a unique, everywhere positive invariant distribution P [31] and, for a finite S, it is equivalent to the existence of some n ∈ ℕ such that (π_ij)^n > 0…”
Section: Remark
mentioning, confidence: 99%
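Both halves of the quoted criterion can be checked numerically for a finite chain: test whether some power of the transition matrix is strictly positive, and compute the invariant distribution as the left eigenvector for eigenvalue 1. The example matrix is an illustrative assumption:

```python
import numpy as np

# Illustrative 3-state transition matrix (not from the cited work).
P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
])

def is_ergodic(P, max_power=None):
    """Check whether P^n > 0 entrywise for some n up to max_power."""
    n_states = len(P)
    max_power = max_power or n_states ** 2  # safe bound for finite chains
    Q = np.eye(n_states)
    for _ in range(max_power):
        Q = Q @ P
        if np.all(Q > 0):
            return True
    return False

def invariant_distribution(P):
    """Solve pi P = pi, sum(pi) = 1 via the left eigenvector for eigenvalue 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

print(is_ergodic(P))            # True: already P^2 > 0 entrywise
pi = invariant_distribution(P)
print(pi)                       # unique and everywhere positive
print(np.allclose(pi @ P, pi))  # stationarity check: True
```

Since this P is doubly stochastic, its invariant distribution is uniform, which makes the positivity claim easy to see by eye.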
“…Markov decision processes provide a popular framework for sequential decision-making in a random dynamic environment [31]. At each time step, an agent observes the state of the system of interest and chooses an action.…”
Section: Basics
mentioning, confidence: 99%
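The observe-act-transition loop described in this statement can be sketched directly. The two-state, two-action MDP below, along with its rewards and transition kernels, is an illustrative toy and not taken from the cited work:

```python
import numpy as np

rng = np.random.default_rng(1)

# P[a][s] = next-state distribution after taking action a in state s
P = {
    0: np.array([[0.8, 0.2], [0.3, 0.7]]),
    1: np.array([[0.1, 0.9], [0.6, 0.4]]),
}
# R[s, a] = immediate reward for action a in state s
R = np.array([[1.0, 0.0], [0.0, 2.0]])

def rollout(policy, steps, s0=0):
    """Run the observe-act-transition loop and return the total reward."""
    s, total = s0, 0.0
    for _ in range(steps):
        a = policy(s)                 # agent observes s and picks an action
        total += R[s, a]
        s = int(rng.choice(2, p=P[a][s]))  # environment transitions
    return total

total = rollout(lambda s: s, steps=100)    # toy policy: action = state index
print(total)
```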
“…Available RL algorithms are by no means adequate. Theoretical studies prove convergence in only a few narrow special cases (see [14], [8]). Practical experience indicates that they generally do not achieve the Bellman optimality condition (that is, the globally optimal solution; see [1]).…”
Section: Introduction
mentioning, confidence: 99%
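Tabular Q-learning on a small finite MDP is one of the narrow settings where convergence guarantees do exist (given sufficient exploration and suitable step sizes), which makes it a useful contrast to the quoted statement. A minimal sketch; the MDP, step size, and exploration rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

# Two-state, two-action toy MDP (illustrative, not from the cited work).
P = {0: np.array([[0.9, 0.1], [0.1, 0.9]]),   # P[a][s] = next-state dist.
     1: np.array([[0.5, 0.5], [0.5, 0.5]])}
R = np.array([[0.0, 1.0], [2.0, 0.0]])        # R[s, a]
gamma, eps, alpha = 0.9, 0.2, 0.1

Q = np.zeros((2, 2))
s = 0
for _ in range(50000):
    # epsilon-greedy action selection keeps every (s, a) pair visited
    a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next = int(rng.choice(2, p=P[a][s]))
    # standard Q-learning update toward the Bellman optimality target
    Q[s, a] += alpha * (R[s, a] + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(np.round(Q, 2))
print(np.argmax(Q, axis=1))  # greedy policy per state
```

In settings like this, tabular Q-values stay within [0, R_max/(1-gamma)] and the greedy policy stabilizes; the difficulty the quote points to arises once function approximation or large state spaces replace the table.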