2005
DOI: 10.1007/978-3-540-32274-0_7

Learning to Coordinate Using Commitment Sequences in Cooperative Multi-agent Systems

Cited by 10 publications (4 citation statements)
References 1 publication
“…However, in games with stochastic rewards the heuristic may perform poorly. Therefore the authors propose the use of commitment sequences in [10] to allow for stochastic rewards. In a way these commitment sequences are very similar to the exploration phases of ESRL learning; however, the former approach strongly depends on synchronous action-selection, whereas ESRL is suited for asynchronous real-life applications, as shown in Sect.…”
Section: Related Work
confidence: 99%
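
As context for this excerpt, a minimal sketch of the commitment-sequence idea it describes: an agent commits to repeating one action over a pre-agreed block of synchronized time steps and judges the action by its average reward over the block, so a single noisy payoff cannot mislead it. The block length, helper names, and reward model below are illustrative assumptions, not the cited paper's exact protocol.

```python
import random

def evaluate_with_commitment(env_step, action, block_length):
    """Commit to `action` for `block_length` synchronized steps and
    return the mean reward, smoothing out stochastic payoffs."""
    total = 0.0
    for _ in range(block_length):
        total += env_step(action)
    return total / block_length

# Hypothetical stochastic reward: action 0 is better on average (1.0 vs 0.5),
# but with noise this large a single sample is an unreliable signal.
def noisy_reward(action):
    return random.gauss(1.0 if action == 0 else 0.5, 2.0)

estimates = {a: evaluate_with_commitment(noisy_reward, a, block_length=50)
             for a in (0, 1)}
print(max(estimates, key=estimates.get))  # usually action 0
```

Because every agent must sit inside the same commitment block at the same time, the scheme presumes synchronous action selection; that dependence is exactly what the excerpt contrasts with ESRL's suitability for asynchronous settings.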
“…First, we apply the algorithms to the cooperative matrix games presented in Figures 1 and 2 (pages 4, 6). These games have received considerable attention from the community (Claus & Boutilier, 1998; Lauer & Riedmiller, 2004; Carpenter & Kudenko, 2005; Kapetanakis et al., 2005; McGlohon & Sen, 2005; Verbeeck et al., 2007) as they remain challenging for state-of-the-art algorithms despite their apparent simplicity. Indeed, even a simple game with two players and few actions can be challenging if agents are unaware of the game and independent (Abdallah & Lesser, 2008).…”
Section: Penalty and Climbing Games
confidence: 99%
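
The cited figures are not reproduced on this page, but the "Figure 1" game in this literature is typically the penalty game of Claus and Boutilier (1998); the matrix below is that standard formulation, offered as an assumption about the variant used. It is a common-payoff game: each cell is the reward both agents receive, and the penalty k <= 0 punishes near-miss coordination around the two optima. (The climbing game of Figure 2 is sketched after the next excerpt.)

```python
# Standard penalty game (Claus & Boutilier, 1998). Rows are agent 1's
# actions, columns agent 2's; both agents receive the cell value.
# (a0, b0) and (a2, b2) are optima; k = -20 is an illustrative penalty.
K = -20
PENALTY_GAME = [
    [10, 0,  K],
    [ 0, 2,  0],
    [ K, 0, 10],
]
```

With k strongly negative, an agent exploring toward one optimum while its partner aims for the other is punished, which is what keeps this apparently simple game hard for independent learners.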
“…Only convergence to (possibly suboptimal) Nash equilibria can be guaranteed (Claus and Boutilier, 1998). Games such as the climbing game (Kapetanakis et al., 2003), shown in Figure 2, are generally accepted as hard coordination problems for independent learners. In this game, independent learners may get stuck in the suboptimal Nash equilibrium (a12, a22) with payoff 7.…”
Section: Exploring Selfish Reinforcement Learners
confidence: 99%
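
The climbing game the excerpt refers to is standardly given by the matrix below (Claus & Boutilier, 1998); with 0-based indices, the excerpt's suboptimal equilibrium (a12, a22) with payoff 7 is cell (1, 1). The epsilon-greedy independent Q-learners are a generic baseline sketch of the "getting stuck" behavior, not the ESRL algorithm discussed here; parameter values are illustrative.

```python
import random

# Climbing game: both agents receive the same payoff.
CLIMBING = [[ 11, -30, 0],
            [-30,   7, 6],
            [  0,   0, 5]]

def independent_q_run(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    """Two independent epsilon-greedy Q-learners; each agent observes
    only its own action and the shared reward, never the joint action."""
    rng = random.Random(seed)
    q1, q2 = [0.0] * 3, [0.0] * 3
    for _ in range(episodes):
        a = rng.randrange(3) if rng.random() < eps else max(range(3), key=q1.__getitem__)
        b = rng.randrange(3) if rng.random() < eps else max(range(3), key=q2.__getitem__)
        r = CLIMBING[a][b]
        q1[a] += alpha * (r - q1[a])  # update uses own action only
        q2[b] += alpha * (r - q2[b])
    return max(range(3), key=q1.__getitem__), max(range(3), key=q2.__getitem__)

print(independent_q_run())  # typically (1, 1): the payoff-7 equilibrium
```

The -30 punishments flanking the optimum at (0, 0) drag down each agent's estimate of its first action whenever the partner explores, so both learners retreat to the safer second action, which is precisely the miscoordination the excerpt describes.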