2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 2017
DOI: 10.1109/allerton.2017.8262875

On exploiting spectral properties for solving MDP with large state space

Cited by 8 publications
(5 citation statements)
References 15 publications
“…Optimizing this function in the process of mapping an unknown environment, where the objective model and the time needed to build it are unknown, is still under research. Though the process of predicting the future impact of an action is computationally expensive, there are recent advancements by using spectral techniques [ 163 ] and deep learning [ 164 ].…”
Section: On Going Developments (mentioning)
confidence: 99%
“…In the first category, the system itself is approximated by a low-complexity system (e.g., smaller dimension), whereas an approximately optimal solution can be obtained. Methods in this category include bisimulation [13], [6], PCA analysis [26], [20], and information-theoretic compression such as the information bottleneck method [1], [17]. In the second category, a low-complexity policy is instead obtained directly.…”
Section: B Contribution (mentioning)
confidence: 99%
“…3 Result on the spectral value iteration algorithm. Using the notation of [1], we call: • $P_\mu$ the state transition matrix under policy $\mu$, such that $\|P_\mu\|_\infty = 1$.…”
Section: Spectral Radius (mentioning)
confidence: 99%
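
The norm condition in the statement above holds for any row-stochastic matrix, because the infinity norm of a matrix is its largest absolute row sum. A minimal check, assuming a hypothetical 3-state matrix `P_mu` whose values are purely illustrative and not taken from the cited work:

```python
import numpy as np

# Hypothetical 3-state transition matrix under a fixed policy
# (illustrative values only; each row is a probability distribution).
P_mu = np.array([
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
])

# Every row sums to 1 because it is a distribution over next states.
row_sums = P_mu.sum(axis=1)

# For a 2-D array, ord=np.inf gives the maximum absolute row sum.
inf_norm = np.linalg.norm(P_mu, ord=np.inf)

print(row_sums, inf_norm)
```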
“…The article [1] introduces a method to generalize the value iteration algorithm, which becomes computationally infeasible for MDPs with large state-space. This algorithm requires running the value iteration algorithm on a subspace of the state space that is chosen according to the spectral properties of the probability transition matrix of the process.…”
Section: Introduction (mentioning)
confidence: 99%
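
The idea described in the last statement can be illustrated with a rough sketch. This is not the algorithm of the cited paper: it assumes a fixed policy (so only policy evaluation is shown), a symmetric transition matrix (so `np.linalg.eigh` applies), and a subspace spanned by the `k` eigenvectors with the largest eigenvalue magnitudes. The names `spectral_policy_evaluation` and `lazy_cycle_walk` are hypothetical helpers introduced for this example.

```python
import numpy as np

def spectral_policy_evaluation(P, r, gamma, k, iters=500):
    """Iterate V <- r + gamma * P V restricted to a k-dimensional
    spectral subspace of P. Assumes P is symmetric."""
    eigvals, eigvecs = np.linalg.eigh(P)
    # Keep the k eigenvectors with the largest eigenvalue magnitudes.
    idx = np.argsort(-np.abs(eigvals))[:k]
    U = eigvecs[:, idx]        # n x k orthonormal basis of the subspace
    P_small = U.T @ P @ U      # k x k reduced transition operator
    r_small = U.T @ r          # reward projected onto the subspace
    w = np.zeros(k)
    for _ in range(iters):
        # Value iteration carried out on k coordinates instead of n states.
        w = r_small + gamma * (P_small @ w)
    return U @ w               # lift back to the full state space

def lazy_cycle_walk(n):
    """Symmetric, row-stochastic lazy random walk on a cycle of n states."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = 0.5
        P[i, (i - 1) % n] += 0.25
        P[i, (i + 1) % n] += 0.25
    return P

n, gamma = 50, 0.9
P = lazy_cycle_walk(n)
# Smooth reward that lies in the span of the leading eigenvectors of P.
r = np.cos(2 * np.pi * np.arange(n) / n)

V_exact = np.linalg.solve(np.eye(n) - gamma * P, r)
V_approx = spectral_policy_evaluation(P, r, gamma, k=3)
max_error = np.max(np.abs(V_exact - V_approx))
print(max_error)
```

In this toy case the reward lies inside the chosen subspace, so iterating on 3 coordinates reproduces the 50-state solution up to numerical error. For a general reward the result is only an approximation whose quality depends on how much of the reward falls outside the retained eigenvectors.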