2020
DOI: 10.1109/access.2019.2960064
Data-Driven Nonzero-Sum Game for Discrete-Time Systems Using Off-Policy Reinforcement Learning

Abstract: In this paper, we develop a data-driven algorithm to learn the Nash equilibrium solution for a two-player non-zero-sum (NZS) game with completely unknown linear discrete-time dynamics based on off-policy reinforcement learning (RL). This algorithm solves the coupled algebraic Riccati equations (CARE) forward in time in a model-free manner by using online measured data. We first derive the CARE for solving the two-player NZS game. Then, a model-free off-policy RL method is developed to obviate the requirement of comp…
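The paper's contribution is learning this Nash solution model-free from measured data. As a rough point of reference for what is being solved, here is a model-based policy-iteration sketch for a hypothetical two-player discrete-time LQ nonzero-sum game; all matrices are illustrative values, not taken from the paper, and this simple best-response iteration is not guaranteed to converge in general.

```python
import numpy as np

# Hypothetical two-state system with one input channel per player
# (example values only, not from the paper).
A  = np.array([[0.9, 0.1],
               [0.0, 0.8]])          # open-loop dynamics (stable)
B1 = np.array([[1.0], [0.0]])        # player 1 input channel
B2 = np.array([[0.0], [1.0]])        # player 2 input channel
Q1, Q2 = np.eye(2), 2.0 * np.eye(2)  # each player's state weight
R1, R2 = np.eye(1), np.eye(1)        # each player's control weight

def dlyap(Ac, Qt, iters=500):
    """Fixed-point solve of P = Ac.T @ P @ Ac + Qt (requires stable Ac)."""
    P = Qt.copy()
    for _ in range(iters):
        P = Ac.T @ P @ Ac + Qt
    return P

K1 = np.zeros((1, 2))
K2 = np.zeros((1, 2))
for _ in range(50):
    Ac = A - B1 @ K1 - B2 @ K2              # closed loop under current gains
    P1 = dlyap(Ac, Q1 + K1.T @ R1 @ K1)     # policy evaluation, player 1
    P2 = dlyap(Ac, Q2 + K2.T @ R2 @ K2)     # policy evaluation, player 2
    # Policy improvement: each player best-responds to the other's gain
    K1 = np.linalg.solve(R1 + B1.T @ P1 @ B1, B1.T @ P1 @ (A - B2 @ K2))
    K2 = np.linalg.solve(R2 + B2.T @ P2 @ B2, B2.T @ P2 @ (A - B1 @ K1))

rho = max(abs(np.linalg.eigvals(A - B1 @ K1 - B2 @ K2)))
print("closed-loop spectral radius:", rho)  # < 1 if the iteration stabilized
```

At a fixed point, (P1, K1) and (P2, K2) satisfy the coupled Riccati conditions; the paper's off-policy RL method reaches the same solution without knowing A, B1, or B2, using only state and input data.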

Cited by 6 publications (11 citation statements)
References 53 publications
“…Various RL algorithms [15][16][17][18] exhibit diverse computational complexities, leading to varying demands for computational power. Therefore, it becomes imperative to study the method that can reduce the computational power requirements of algorithms without compromising their performance.…”
Section: Introduction (confidence: 99%)
See 1 more Smart Citation
“…Various RL algorithms [15][16][17][18] exhibit diverse computational complexities, leading to varying demands for computational power. Therefore, it becomes imperative to study the method that can reduce the computational power requirements of algorithms without compromising their performance.…”
Section: Introductionmentioning
confidence: 99%
“…The third type is the actor-critic (AC) structure, which is also the most widely used. For example, [16]-[18] solved the NZS problem for different controlled systems. There is also a synchronous RL method [28,29] based on the AC structure, which can continuously and simultaneously adjust the weights of the actor NN and critic NN.…”
Section: Introduction (confidence: 99%)
“…Many practical application scenarios can be modeled as a multi-input system which is controlled by multiple controllers [1,2]. From the perspective of game theory [3-5], the study of optimal control problems for multi-control-input systems has become a hotspot in control theory research [6-10]. Based on the different roles and tasks of each control input, the optimal control problem of multi-input systems can be divided into: fully cooperative (FC) games [11], zero-sum (ZS) games [12], and nonzero-sum (NZS) games [13].…”
Section: Introduction (confidence: 99%)
“…From the perspective of game theory [3-5], the study of optimal control problems for multi-control-input systems has become a hotspot in control theory research [6-10]. Based on the different roles and tasks of each control input, the optimal control problem of multi-input systems can be divided into: fully cooperative (FC) games [11], zero-sum (ZS) games [12], and nonzero-sum (NZS) games [13]. In fact, FC games can be regarded as a special case of NZS games.…”
Section: Introduction (confidence: 99%)
“…Based on the roles and tasks of the inputs, the optimal control of systems with multiple control inputs can be studied from three perspectives: zero-sum (ZS) games, non-zero-sum (NZS) games, and fully cooperative (FC) games [40]. For zero-sum games [41]-[44] and non-zero-sum games [45]-[52], scholars have developed many ADP methods. However, there are few ADP studies on fully cooperative games [53], [54].…”
Section: Introduction (confidence: 99%)