2019
DOI: 10.1109/tsmc.2018.2861826

Approximate Nash Solutions for Multiplayer Mixed-Zero-Sum Game With Reinforcement Learning

Cited by 68 publications (27 citation statements)
References 40 publications
“…Most closely related are results on H ∞ control of multi-agent and multi-player systems [6]- [9]. In [7] each agent has its own dynamics, and the anti-interference problem has been investigated for continuous-time multi-player systems in [6], [8], [9]. The main difference between multi-player and multi-agent systems is that in multi-player systems all players can access the state of the overall system, which motivates a design approach for H ∞ controllers different from the one used for multi-agent systems.…”
Section: Introduction
confidence: 99%
“…The main difference between multi-player and multi-agent systems is that in multi-player systems all players can access the state of the overall system, which motivates a design approach for H ∞ controllers different from the one used for multi-agent systems. Like [6], [8], [9], this paper considers model-free H ∞ controller design for multi-player systems; however, the nature of discrete-time sampling, in contrast to continuous-time processes, makes the H ∞ control problem more complicated to solve from the discrete-time perspective, and multiple players with completely unknown dynamics add to this difficulty. Moreover, in view of the advantages of off-policy learning over on-policy learning shown in our previous result [37], where an off-policy Q-learning method was proposed for multi-player systems without considering disturbances, our goal is to develop an off-policy game Q-learning algorithm that solves the H ∞ control problem for discrete-time linear multi-player systems using only measured data.…”
Section: Introduction
confidence: 99%
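The off-policy game Q-learning idea described in this statement can be illustrated concretely. The following is a minimal sketch, not the cited paper's algorithm: it poses discrete-time H ∞ control as a two-player zero-sum linear-quadratic game, parameterizes the Q-function as a quadratic form, fits its kernel by least squares on off-policy data (behavior policies = target policies plus exploration noise), and updates both the control gain and the worst-case-disturbance gain from the fitted kernel. The system matrices A, B, E, the weights, the attenuation level, and the exploration scale are all illustrative assumptions.

# A minimal sketch (assumptions throughout; not the cited paper's algorithm):
# off-policy Q-learning for the two-player zero-sum linear-quadratic game
# underlying discrete-time H-infinity control.
import numpy as np

np.random.seed(0)
A = np.array([[0.8, 0.2], [0.0, 0.7]])   # state dynamics (assumed)
B = np.array([[0.0], [1.0]])             # control channel (assumed)
E = np.array([[0.1], [0.0]])             # disturbance channel (assumed)
Qx, R = np.eye(2), np.eye(1)             # cost weights (assumed)
gamma2 = 5.0                             # squared attenuation level (assumed)

n, m, q = 2, 1, 1
z = n + m + q                            # dimension of the stacked vector [x; u; w]
idx = [(i, j) for i in range(z) for j in range(i, z)]  # upper triangle of H

def features(x, u, w):
    # Quadratic basis phi such that Q(x,u,w) = phi(x,u,w) @ vech(H), H symmetric.
    s = np.concatenate([x, u, w])
    return np.array([s[i] * s[j] * (1.0 if i == j else 2.0) for i, j in idx])

def gains_from_H(H):
    # Nash policy improvement: solve the coupled stationarity conditions
    # dQ/du = 0 and dQ/dw = 0 jointly for the stacked linear gains [K; L].
    M = H[n:, n:]                        # [[Huu, Huw], [Hwu, Hww]]
    N = H[n:, :n]                        # [[Hux], [Hwx]]
    KL = -np.linalg.solve(M, N)
    return KL[:m], KL[m:]

K = np.zeros((m, n))                     # initial admissible control policy
L = np.zeros((q, n))                     # initial disturbance policy
for it in range(30):
    Phi, y = [], []
    x = np.array([1.0, -1.0])
    for k in range(200):
        # Behavior policies = target policies + exploration noise (off-policy data).
        u = K @ x + 0.1 * np.random.randn(m)
        w = L @ x + 0.1 * np.random.randn(q)
        r = x @ Qx @ x + u @ R @ u - gamma2 * (w @ w)
        x_next = A @ x + B @ u + E @ w
        # Bellman targets evaluate the *target* policies at the next state.
        Phi.append(features(x, u, w) - features(x_next, K @ x_next, L @ x_next))
        y.append(r)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = np.zeros((z, z))
    for t, (i, j) in zip(theta, idx):
        H[i, j] = H[j, i] = t
    K, L = gains_from_H(H)

print("control gain K:", K)
print("worst-case disturbance gain L:", L)

Because the regression targets evaluate the current target policies at the next state while the data are generated by exploratory behavior policies, the exploration noise does not bias the learned Q-function kernel; this is the practical advantage of off-policy over on-policy learning that the quotation refers to.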
“…The theory of differential games has received increasing attention since it was first studied in [1]. Through the efforts of scholars worldwide, differential game theory, which is now closely linked to daily life, has been widely applied in economics, sociology, and many other domains [2]- [7]. Three indispensable elements, i.e., players, control policies, and performance functions, jointly form the foundation of differential game theory.…”
Section: Introduction
confidence: 99%
“…Adaptive dynamic programming integrates adaptive control, dynamic programming, and reinforcement learning [14,15]. By using an "actor-critic" structure, it can approximate the solution of the Hamilton-Jacobi-Bellman (HJB) equation online without full knowledge of the system model, or even when the model is completely unknown [16,17]. Thus, if an initial admissible control policy is available, an approximately optimal control can be obtained by solving the HJB equation iteratively.…”
Section: Introduction
confidence: 99%
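To make the quoted workflow concrete, here is a minimal model-based sketch of "solving the HJB equation iteratively" in the linear-quadratic special case, where the HJB equation reduces to the algebraic Riccati equation and each policy-evaluation step becomes a Lyapunov equation (Kleinman's policy iteration). The cited actor-critic methods replace the model-based Lyapunov solve with an online function approximator; the matrices A, B, Q, R and the initial admissible gain below are illustrative assumptions.

# A minimal, model-based sketch of iterative HJB solution in the
# linear-quadratic special case (Kleinman's policy iteration).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-1.0, -0.5]])  # open-loop dynamics (assumed, stable)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

K = np.zeros((1, 2))   # initial admissible (stabilizing) policy: A is stable here
for it in range(50):
    Ak = A - B @ K
    # Policy evaluation: solve (A - BK)^T P + P (A - BK) + Q + K^T R K = 0
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    # Policy improvement: u = -Kx with K = R^{-1} B^T P minimizes the Hamiltonian
    K_new = np.linalg.solve(R, B.T @ P)
    if np.linalg.norm(K_new - K) < 1e-10:
        break
    K = K_new

print("converged gain K =", K)
print("value matrix  P =", P)

Starting from an admissible (stabilizing) gain is what guarantees that each Lyapunov equation has a positive-definite solution and that the gains improve monotonically toward the Riccati solution, which mirrors the "initial admissible control policy" requirement in the quotation.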