2023
DOI: 10.48550/arxiv.2301.05630
Preprint

Decentralized model-free reinforcement learning in stochastic games with average-reward objective

Abstract: We propose the first model-free algorithm that achieves low regret for decentralized learning in two-player zero-sum tabular stochastic games with an infinite-horizon average-reward objective. In decentralized learning, the learning agent controls only one player and tries to achieve low regret against an arbitrary opponent. This contrasts with centralized learning, where the agent tries to approximate the Nash equilibrium by controlling both players. In our infinite-horizon undiscounted se…
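
To make the decentralized setting concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a single-state zero-sum game (a repeated matrix game, the simplest special case of a stochastic game) and assumes regret is measured as the horizon times the game's minimax value minus the learner's total reward; the abstract is truncated, so this definition is an assumption. The payoff matrix and the names `play`, `regret`, `uniform`, `always_zero`, and `always_one` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical single-state zero-sum game (matching pennies with 0/1 payoffs):
# the learner earns 1 when both actions match, 0 otherwise.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]
GAME_VALUE = 0.5  # minimax value of this matrix, attained by uniform mixing


def play(learner_policy, opponent_policy, horizon, seed=0):
    """Simulate `horizon` rounds and return the learner's reward sequence."""
    rng = random.Random(seed)
    rewards = []
    for t in range(horizon):
        a = learner_policy(t, rng)   # the learner controls only its own action
        b = opponent_policy(t, rng)  # the opponent is arbitrary
        rewards.append(PAYOFF[a][b])
    return rewards


def regret(rewards, game_value=GAME_VALUE):
    """Assumed regret notion: T * game value minus the total reward collected."""
    return len(rewards) * game_value - sum(rewards)


def uniform(t, rng):
    """Mix uniformly over the two actions."""
    return rng.randrange(2)


def always_zero(t, rng):
    """Deterministically play action 0."""
    return 0


def always_one(t, rng):
    """Deterministically play action 1."""
    return 1
```

For example, `regret(play(always_zero, always_one, 100))` evaluates to `50.0`: a fixed learner that the opponent exploits accumulates regret linearly in the horizon, which is the behaviour a low-regret learner is meant to avoid.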

citations
Cited by 0 publications
references
References 5 publications