2019
DOI: 10.1007/s10458-019-09411-3

SA-IGA: a multiagent reinforcement learning method towards socially optimal outcomes

Abstract: In multiagent environments, the capability of learning is important for an agent to behave appropriately in the face of unknown opponents and dynamic environments. From the system designer's perspective, it is desirable if the agents can learn to coordinate towards socially optimal outcomes, while also avoiding being exploited by selfish opponents. To this end, we propose a novel gradient-ascent-based algorithm (SA-IGA) which augments the basic gradient-ascent algorithm by incorporating social awareness into the po…

Cited by 19 publications (11 citation statements)
References 28 publications
“…The second layer contributes details of how agents' characteristics determine the degree of empathy and make it possible to carry out an adaptive design for the decision algorithm. Given p_ij = f(x), works in [24], [36] take into account the factors of changing empathy, such as social comparison and companion impression, and define f(·) in an incremental form. There are also some works that give the explicit function directly.…”
Section: Artificial Empathy (mentioning)
confidence: 99%
“…It mainly discusses how to make group decisions to maximize social welfare in the known network of empathy weights. The work in [24] also aimed to maximize social welfare, but the difference is that the proposed algorithm is designed in a distributed learning form. As a whole, most articles do not study how the empathy model maps individual characteristics to the degree of empathy and remain focused on how to get rid of the economic dilemma and maximize social welfare under the action of altruism.…”
Section: Introduction (mentioning)
confidence: 99%
“…Banerjee and Peng [21] referred policy dynamics based win or learn fast (PDWoLF) for some bimatrix game and general sum stochastic game. Zhang et al. [22] studied the social-aware IGA (IGA-SA) by introducing social awareness into the strategy update process, if agents adopt rational learning strategies in the context of a repeated game and their strategies converge to the socially optimal outcomes of symmetric bimatrix games. In recent years, many scholars have designed different algorithms to analyze the power control game.…”
Section: Introductionmentioning
confidence: 99%
“…The goal of MARL is determined by the type of task. The goal in a general-sum game is to converge to some type of equilibrium [7] or socially optimal outcomes [8], [9]. This type of learning is known as equilibrium-based MARL (EMARL).…”
Section: Introductionmentioning
confidence: 99%