2022
DOI: 10.48550/arxiv.2206.01880
Preprint
Learning in Congestion Games with Bandit Feedback

Abstract: Learning Nash equilibria is a central problem in multi-agent systems. In this paper, we investigate congestion games, a class of games with benign theoretical structure and broad real-world applications. We first propose a centralized algorithm based on the optimism in the face of uncertainty principle for congestion games with (semi-)bandit feedback, and obtain finite-sample guarantees. Then we propose a decentralized algorithm via a novel combination of the Frank-Wolfe method and G-optimal design. By exploit…

Cited by 2 publications (3 citation statements)
References 17 publications
“…Additionally, Dou et al (2022) provide sample complexity results for the centralized and decentralized problem settings under both semi-bandit and bandit feedback. Similarly, Cui et al (2022) study the regret of the Nash Q-learning algorithm for two-player turn based Markov games, and introduce the first gap dependent logarithmic upper bounds, which match theoretical lower bounds up to log factors, in the episodic tabular setting.…”
Section: Related Work
confidence: 99%
“…In this part we will show the Nash-UCB algorithm proposed in Cui et al [2022] satisfies Assumption 1. We carry out the proof by pointing out the modifications we need to make in their proof.…”
Section: B3 Proof for Theorem
confidence: 99%
“…(Stationary) multi-agent reinforcement learning. Numerous works have been devoted to learning equilibria in (stationary) multi-agent systems, including zero-sum Markov games [Bai et al., 2020], general-sum Markov games [Mao et al., 2022, Song et al., 2021, Daskalakis et al., 2022, Wang et al., 2023, Cui et al., 2023], Markov potential games [Leonardos et al., 2021, Song et al., 2021, Ding et al., 2022, Cui et al., 2023], congestion games [Cui et al., 2022], extensive-form games [Kozuno et al., 2021], and partially observable Markov games [Liu et al., 2022]. These works aim to learn equilibria with bandit feedback efficiently, measured by either regret or sample complexity.…”
Section: Introduction
confidence: 99%