2022
DOI: 10.48550/arxiv.2206.00113
Preprint

BRExIt: On Opponent Modelling in Expert Iteration

Abstract: Finding a best response policy is a central objective in game theory and multi-agent learning, with modern population-based training approaches employing reinforcement learning algorithms as best-response oracles to improve play against candidate opponents (typically previously learnt policies). We propose Best Response Expert Iteration (BRExIt), which accelerates learning in games by incorporating opponent models into the state-of-the-art learning algorithm Expert Iteration (ExIt). BRExIt aims to (1) improve …

Cited by 0 publications
References 16 publications (34 reference statements)