2022
DOI: 10.48550/arxiv.2206.00159
Preprint

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus

Abstract: This paper considers offline multi-agent reinforcement learning. We propose the strategy-wise concentration principle, which directly builds a confidence interval for the joint strategy, in contrast to the point-wise concentration principle that builds a confidence interval for each point in the joint action space. For two-player zero-sum Markov games, by exploiting the convexity of the strategy-wise bonus, we propose a computationally efficient algorithm whose sample complexity enjoys a better dependency on th…
