Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence 2023
DOI: 10.24963/ijcai.2023/16
Beyond Strict Competition: Approximate Convergence of Multi-agent Q-Learning Dynamics

Abstract: The behaviour of multi-agent learning in competitive settings is often considered under the restrictive assumption of a zero-sum game. Only under this strict requirement is the behaviour of learning well understood; beyond this, learning dynamics can often display non-convergent behaviours which prevent fixed-point analysis. Nonetheless, many relevant competitive games do not satisfy the zero-sum assumption. Motivated by this, we study a smooth variant of Q-Learning, a popular reinforcement learning dynamics …
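The smooth Q-Learning dynamics the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 2x2 general-sum payoff matrices and the parameter values are hypothetical. Each agent tracks a Q-value per action, nudges it toward the expected payoff against the opponent's current mixed strategy, and plays a Boltzmann (softmax) policy whose temperature plays the role of the exploration rate.

```python
import numpy as np

def softmax(q, temp):
    """Boltzmann policy: the temperature `temp` smooths the argmax."""
    z = np.exp(q / temp)
    return z / z.sum()

# Hypothetical 2x2 general-sum game (illustrative, not from the paper):
# A[i, j] is agent 1's payoff and B[i, j] agent 2's payoff when
# agent 1 plays action i and agent 2 plays action j.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[0.5, 1.0], [1.0, 0.5]])

alpha, temp, steps = 0.1, 1.0, 2000  # learning rate, exploration rate
Q1, Q2 = np.zeros(2), np.zeros(2)
for _ in range(steps):
    x, y = softmax(Q1, temp), softmax(Q2, temp)
    # Move each Q-value toward the expected payoff of that action
    # against the opponent's current mixed strategy.
    Q1 += alpha * (A @ y - Q1)
    Q2 += alpha * (B.T @ x - Q2)

print(softmax(Q1, temp))  # agent 1's resulting mixed strategy
```

With a sufficiently large exploration rate `temp`, iterates of this kind settle near a smoothed equilibrium even outside the zero-sum setting, which is the regime the paper's convergence results concern.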

Cited by 2 publications (6 citation statements). References 0 publications.
“…so that each agent has N₀ = N − 1 neighbours. This also corresponds to the case analysed by (Sanders, Farmer, and Galla 2018) and (Hussain, Belardinelli, and Piliouras 2023), in which it was predicted that the boundary between stable and unstable learning dynamics is affected by the total number of agents.…”
Section: Methods (supporting)
Confidence: 57%
“…In particular, the region in which learning converges to a fixed point seems to vanish as the number of agents increases. This result is supported by that of (Hussain, Belardinelli, and Piliouras 2023), in which a lower bound on exploration rates was determined so that Q-Learning dynamics converge to a unique equilibrium. Again, it was shown that this lower bound increases with the number of agents.…”
Section: Model and Contributions (mentioning)
Confidence: 52%