2022
DOI: 10.48550/arxiv.2211.03011
Preprint

On learning history based policies for controlling Markov decision processes

Abstract: Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, due to the fact that function approximation in Markov decision processes (MDP) can be viewed as inducing a partially observable MDP. However, there has been little formal analysis of such history-based algorithms, as most existing frameworks focus exclusively on memory-less features. In this paper, we…
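The abstract's contrast between memory-less and history-based policies can be made concrete with a minimal sketch. The code below is illustrative only, not the paper's method: a memory-less policy maps only the current observation to an action, while a history-based policy conditions on a recurrent summary of all past observations. All names (`MemorylessPolicy`, `RecurrentPolicy`, the toy tanh recurrence) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class MemorylessPolicy:
    """Maps the current observation alone to action logits: pi(a | o_t)."""
    def __init__(self, obs_dim, n_actions):
        self.W = rng.normal(size=(n_actions, obs_dim)) * 0.1

    def act(self, obs):
        logits = self.W @ obs
        return int(np.argmax(logits))

class RecurrentPolicy:
    """Maps a recurrent summary of the history to action logits: pi(a | h_t),
    where h_t = tanh(W_h h_{t-1} + W_o o_t) compresses (o_1, ..., o_t)."""
    def __init__(self, obs_dim, hidden_dim, n_actions):
        self.W_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
        self.W_o = rng.normal(size=(hidden_dim, obs_dim)) * 0.1
        self.W_a = rng.normal(size=(n_actions, hidden_dim)) * 0.1
        self.h = np.zeros(hidden_dim)

    def act(self, obs):
        # The hidden state plays the role of a learned summary of history.
        self.h = np.tanh(self.W_h @ self.h + self.W_o @ obs)
        logits = self.W_a @ self.h
        return int(np.argmax(logits))

# Toy rollout: feed the same observation stream to both policy classes.
obs_stream = [rng.normal(size=4) for _ in range(5)]
memoryless = MemorylessPolicy(obs_dim=4, n_actions=3)
recurrent = RecurrentPolicy(obs_dim=4, hidden_dim=8, n_actions=3)
for o in obs_stream:
    print(memoryless.act(o), recurrent.act(o))
```

Note that the recurrent policy's actions can differ across repeated observations because its hidden state carries information forward, which is exactly the capacity the memory-less policy lacks.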

Cited by 1 publication (1 citation statement)
References 34 publications (45 reference statements)
“…In the stochastic formulation, the notion of approximate information states was presented in [81] to address the challenges of control and learning with partial observations. Approximate information states can improve the computational tractability of control problems with large state spaces at the cost of a bounded loss in performance [82]. The explicit performance bounds of a finite-memory based approximate information state were derived in [83].…”
Section: Introduction (mentioning)
confidence: 99%
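For context, the approximate information state referenced in the citing statement is usually characterized by two defining conditions. The LaTeX below paraphrases one common presentation; the symbols ($\hat{\sigma}_t$, $\hat{r}$, $\hat{P}$, $\varepsilon$, $\delta$, and the metric $d_{\mathfrak{F}}$) are assumptions of this sketch, not quoted from [81]–[83].

```latex
% Sketch of the two AIS conditions (paraphrased, not quoted from [81]-[83]).
% A compression \hat{z}_t = \hat{\sigma}_t(h_t) of the history h_t is an
% (\varepsilon, \delta)-approximate information state if:
%
% (AP1) it approximately predicts the immediate reward:
\[
  \bigl| \mathbb{E}[R_t \mid H_t = h_t, A_t = a_t]
         - \hat{r}(\hat{z}_t, a_t) \bigr| \le \varepsilon,
\]
% (AP2) it approximately predicts its own evolution, measured in an
% integral probability metric d_F (e.g., total variation or Wasserstein):
\[
  d_{\mathfrak{F}}\!\left(
    \mathbb{P}(\hat{Z}_{t+1} \in \cdot \mid H_t = h_t, A_t = a_t),\;
    \hat{P}(\cdot \mid \hat{z}_t, a_t)
  \right) \le \delta.
\]
```

These two conditions are what yield the bounded performance loss mentioned in the citing statement: a policy computed from $\hat{z}_t$ is near-optimal up to a term that scales with $\varepsilon$ and $\delta$.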