Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS '03), 2003
DOI: 10.1145/860685.860687
A selection-mutation model for q-learning in multi-agent systems

Abstract: Although well understood in the single-agent framework, the use of traditional reinforcement learning (RL) algorithms in multi-agent systems (MAS) is not always justified. The feedback an agent experiences in a MAS is usually influenced by the other agents present in the system. Multi-agent environments are therefore non-stationary, and the convergence and optimality guarantees of RL algorithms are lost. To better understand the dynamics of traditional RL algorithms, we analyze the learning process in terms of evol…

Cited by 36 publications (86 citation statements) · References: 2 publications
“…This represents the exploration of action by individual i (it is an analogue of mutation in evolutionary biology), and brings the dynamics back into the interior of the state space if it gets too close to the boundary. The second term in the brackets takes the same form as the replicator equation (Hofbauer and Sigmund, 1998; Tuyls et al., 2003); that is, if the expected reinforcement, R̄_i(a), to action a is higher than the average expected reinforcement,…”
Section: Differential Equations for Action Play Probabilities (citation type: mentioning; confidence: 99%)
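
For orientation, dynamics of the selection-mutation type described in this statement are commonly written in the following form. This is a sketch in standard notation, not a quotation from either paper; ε denotes the exploration (mutation) rate, n the number of actions, and p_i(a) the probability that individual i plays action a:

\frac{d p_i(a)}{dt} = \underbrace{\varepsilon \left( \frac{1}{n} - p_i(a) \right)}_{\text{exploration (mutation)}} + \underbrace{p_i(a) \left( \bar{R}_i(a) - \sum_b p_i(b)\, \bar{R}_i(b) \right)}_{\text{selection (replicator)}}

The first term pushes play toward the uniform distribution, which is what keeps the dynamics off the boundary of the state space; the second term is the standard replicator form, growing p_i(a) exactly when action a earns more than the current average.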
“…This system (without the exploration term) is known as the bi-matrix replicator equation [20,21]. Its relation to multi-agent learning has been examined in [6,8,22–24]. Before proceeding further, we elaborate on the connection between the rest-points of the replicator system Eqs.…”
Section: B. Two-Agent Learning (citation type: mentioning; confidence: 99%)
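
For reference, the bi-matrix replicator equation mentioned in this statement has the standard two-population form (see Hofbauer and Sigmund, 1998). With payoff matrices A and B and mixed strategies x and y for the two agents:

\dot{x}_i = x_i \left[ (Ay)_i - x^{\top} A y \right], \qquad \dot{y}_j = y_j \left[ (B^{\top} x)_j - x^{\top} B y \right]

Adding an exploration term of the kind quoted above perturbs the rest points of this system, which is the connection the cited passage goes on to examine.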
“…Namely, if one associates a particular biological trait with each pure strategy, then the adaptive learning of (possibly mixed) strategies in multi-agent settings is analogous to the competitive dynamics of a mixed population, where the species evolve according to their relative fitness in the population. This framework has been used successfully to study various interesting features of the adaptive dynamics of learning agents [7,16,19,20,24–26].…”
Section: Introduction (citation type: mentioning; confidence: 99%)
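
As a concrete illustration of this learning-as-evolution analogy, the following minimal Python sketch (illustrative only, not code from the paper or the citing works; the matching-pennies payoffs, step size dt, and exploration rate eps are assumptions) integrates two-population replicator dynamics with a small uniform mutation term:

import numpy as np

# Illustrative zero-sum game (matching pennies); not taken from the paper.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])  # row agent's payoffs
B = -A                       # column agent's payoffs (zero-sum)

def step(x, y, dt=0.01, eps=0.02):
    # One Euler step of replicator (selection) dynamics plus a uniform
    # mutation term eps*(1/n - p) that keeps play off the simplex boundary.
    fx = A @ y               # fitness of each row action against mixture y
    fy = B.T @ x             # fitness of each column action against mixture x
    dx = x * (fx - x @ fx) + eps * (1.0 / x.size - x)
    dy = y * (fy - y @ fy) + eps * (1.0 / y.size - y)
    return x + dt * dx, y + dt * dy

x = np.array([0.9, 0.1])     # initial mixed strategy of agent 1
y = np.array([0.2, 0.8])     # initial mixed strategy of agent 2
for _ in range(20000):
    x, y = step(x, y)
print(x, y)                  # both settle near the interior point (0.5, 0.5)

Without the eps term, the trajectories of this zero-sum game cycle around the mixed equilibrium; the mutation term makes the interior fixed point attracting, mirroring the stabilizing role the quoted passages attribute to exploration.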