2023
DOI: 10.48550/arxiv.2301.12942
Preprint

Refined Regret for Adversarial MDPs with Linear Function Approximation

Abstract: We consider learning in an adversarial Markov Decision Process (MDP) where the loss functions can change arbitrarily over K episodes and the state space can be arbitrarily large. We assume that the Q-function of any policy is linear in some known features, that is, a linear function approximation exists. The best existing regret upper bound for this setting (Luo et al., 2021b) is of order O(K^{2/3}) (omitting all other dependencies), given access to a simulator. This paper provides two algorithms that improve t…
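The linear function approximation assumption in the abstract is usually formalized as the Q-function of every policy being linear in a known feature map. The display below is a standard statement of that assumption; the feature map φ, dimension d, and norm bound B are generic symbols, and the exact constants in the paper may differ.

```latex
% Linear-Q assumption (standard form; the symbols d and B are generic, not
% taken from the paper): there is a known feature map
% \phi : S \times A \to \mathbb{R}^d such that, for every policy \pi and
% every stage h, its Q-function is linear in \phi.
Q_h^{\pi}(s,a) \;=\; \big\langle \phi(s,a),\, w_h^{\pi} \big\rangle
\qquad \text{for some } w_h^{\pi} \in \mathbb{R}^d \text{ with } \|w_h^{\pi}\|_2 \le B .
```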

Cited by 1 publication (5 citation statements)
References 3 publications
“…Similar to previous works on linear adversarial MDP [Luo et al., 2021a,b; Sherman et al., 2023; Dai et al., 2023] that discuss the cases of whether a transition simulator is available, we define the simulator that may help in the following assumption. Note that this simulator is weaker than Luo et al. [2021a,b]; Dai et al. [2023] because their simulator could generate a next state given any state-action pair.…”
Section: Discussion
confidence: 99%
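The simulator distinction drawn in this statement is between a generative simulator, which can be queried at an arbitrary state-action pair, and a weaker one that only supports rollouts from the start of an episode. The interfaces below are a minimal illustrative sketch of that distinction; the class and method names are hypothetical and do not come from any of the cited papers.

```python
from typing import Callable, List, Protocol, Tuple

State = int    # placeholder state type (hypothetical)
Action = int   # placeholder action type (hypothetical)


class GenerativeSimulator(Protocol):
    """Stronger simulator (the kind attributed above to Luo et al. [2021a,b]
    and Dai et al. [2023]): samples a next state and loss for ANY (s, a)."""

    def sample(self, state: State, action: Action) -> Tuple[State, float]:
        ...


class TrajectorySimulator(Protocol):
    """Weaker simulator: only rolls out whole episodes from the initial state
    under a given policy; arbitrary (s, a) queries are not allowed.
    (The exact weaker simulator assumed by the citing paper may differ.)"""

    def rollout(
        self, policy: Callable[[State], Action]
    ) -> List[Tuple[State, Action, float]]:
        ...
```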
“…We could vary this construction procedure by further putting a finite action covering over the action space to deal with the infinite action space setting. More details can be found in Appendix C. Meanwhile, Neu and Olkhovskaya [2021] requires both state and action spaces to be finite, and Luo et al. [2021a,b]; Dai et al. [2023]; Sherman et al. [2023] can only deal with a finite action space.…”
Section: Discussion
confidence: 99%
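As a concrete illustration of the finite action covering mentioned in this statement: for a bounded continuous action set such as [0, 1]^d, a uniform grid with spacing at most ε is a finite ε-cover, and a finite-action algorithm can then be run over the grid points. The helper below is a sketch under that assumption and is not code from the paper.

```python
import itertools
import math
from typing import List, Tuple


def epsilon_cover_unit_cube(dim: int, eps: float) -> List[Tuple[float, ...]]:
    """Return a finite grid that eps-covers the action set [0, 1]^dim in the
    sup norm: every action in the cube lies within eps of some grid point."""
    n = max(1, math.ceil(1.0 / eps))       # grid cells per axis, spacing 1/n <= eps
    axis = [k / n for k in range(n + 1)]   # grid coordinates 0, 1/n, ..., 1
    return list(itertools.product(axis, repeat=dim))


# Example: eps = 0.5 in dimension 2 gives the 9 grid points {0, 0.5, 1}^2;
# an adversarial-MDP algorithm for finite action sets could then be run on
# this discretized action set.
cover = epsilon_cover_unit_cube(dim=2, eps=0.5)
assert len(cover) == 9
```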