2022
DOI: 10.1109/tnnls.2021.3069728

Reinforcement Learning-Based Cooperative Optimal Output Regulation via Distributed Adaptive Internal Model

Cited by 57 publications (14 citation statements). References 51 publications.
“…With the increasing accessibility of low-cost, high-performance computing technology, DRL has been effectively applied to various areas. With the help of neural networks as function approximators, DRL can handle high-dimensional state and action spaces [25, 26, 27, 28, 29, 30], as is the case with autonomous vehicle platoons [31]. Using a model-free DRL algorithm eliminates the need to model the environment’s complicated dynamics (the transition function/probability distribution).…”
Section: Literature Review
confidence: 99%
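
A minimal sketch of the function-approximation idea in the passage above: a small neural network maps a continuous state vector directly to one value per action, so no table over discretized states is ever stored. The framework (PyTorch), the layer sizes, and names such as QNetwork are illustrative assumptions, not taken from the cited works.

```python
# Minimal sketch: a neural network as a Q-function approximator.
# All sizes and names (QNetwork, state_dim, num_actions) are illustrative,
# not taken from the cited papers.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        # Maps a continuous state vector to one Q-value per discrete action,
        # so no table over states is needed even for high-dimensional states.
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Example: a platoon-like state (gaps and relative speeds of several vehicles)
# handled by one network, regardless of how finely a tabular method would have
# to discretize it.
q = QNetwork(state_dim=8, num_actions=5)
greedy_action = q(torch.randn(1, 8)).argmax(dim=-1)
```
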
“…An RL algorithm can be formulated as a Markov decision process (MDP) [25, 28, 29, 30], a mathematical framework for sequential decision making under uncertainty. The MDP formulation is used to choose the appropriate action given a complete set of observations [52].…”
Section: Preliminary Study on Reinforcement Learning
confidence: 99%
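
To make the MDP terminology above concrete, here is a minimal sketch of a finite MDP solved by value iteration; the two-state transition probabilities and rewards are invented purely for illustration and are unrelated to the paper.

```python
# Minimal sketch of a finite Markov decision process (MDP) solved by value
# iteration. The transition probabilities P[s, a, s'] and rewards R[s, a]
# below are made up for illustration.
import numpy as np

num_states, num_actions, gamma = 2, 2, 0.9
P = np.array([  # P[s, a, s'] = probability of landing in s' after action a in s
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.0, 1.0]],
])
R = np.array([  # R[s, a] = expected immediate reward
    [1.0, 0.0],
    [0.0, 2.0],
])

V = np.zeros(num_states)
for _ in range(500):
    # Bellman optimality backup: V(s) = max_a [ R(s,a) + gamma * sum_s' P(s,a,s') V(s') ]
    Q = R + gamma * P @ V          # shape (num_states, num_actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)          # greedy action in each state
print("optimal values:", V, "greedy policy:", policy)
```
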
“…These methods do not require complete knowledge of the system model; rather, they use information about the state, input, and output of each agent to establish feasibility of solutions and guarantees on the convergence of reinforcement learning-based algorithms. Input and state data were used in an online manner to design a distributed control algorithm that solves a cooperative optimal output regulation problem in leader-follower systems in [30]. Information obtained from the trajectories of each player was used in [31] to develop real-time solutions to multi-player games through the design of an actor-critic-based adaptive learning algorithm.…”
Section: Related Work
confidence: 99%
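
As a rough, hedged illustration of learning a controller from input/state data rather than from a model (in the spirit of the passage above, not a reproduction of the algorithms in [30] or [31]): the sketch below runs model-free policy iteration for a toy discrete-time LQR problem. The plant matrices A and B are used only to simulate data and are never read by the learner; all numbers are invented.

```python
# Minimal sketch (not the cited algorithms) of model-free policy iteration for
# a discrete-time LQR problem: the learner sees only (state, input, cost,
# next state) data; the matrices A and B are used solely to simulate the plant.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.9]])   # hypothetical plant, hidden from the learner
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.eye(1)            # quadratic stage-cost weights
n, m = 2, 1
K = np.zeros((m, n))                     # initial gain (stabilizing here since A is stable)

def quad_features(z):
    """Features so that z^T H z (H symmetric) is linear in the packed entries of H."""
    i, j = np.triu_indices(len(z))
    return np.where(i == j, 1.0, 2.0) * np.outer(z, z)[i, j]

for _ in range(8):
    Phi, y, x = [], [], rng.standard_normal(n)
    for _ in range(400):
        u = -K @ x + 0.5 * rng.standard_normal(m)       # exploratory input
        x_next = A @ x + B @ u
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])  # on-policy action at next state
        # Bellman equation for the current policy's Q-function:
        #   z^T H z - z_next^T H z_next = x^T Qc x + u^T Rc u
        Phi.append(quad_features(z) - quad_features(z_next))
        y.append(x @ Qc @ x + u @ Rc @ u)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    H = H + H.T - np.diag(np.diag(H))                   # rebuild symmetric H from data

    K = np.linalg.solve(H[n:, n:], H[n:, :n])           # policy improvement: u = -K x

print("learned feedback gain K:\n", K)
```
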
“…In [23], a model-free RL-based method is proposed to design suboptimal adaptive controllers for linear continuous-time multi-agent systems, in which information about the leader's system matrix is required to design the controller for each follower and the controller gains depend on the communication graph. Different from [23], [25] proposes an effective optimal algorithm for discrete-time multi-agent systems, where the leader's system matrix is required in designing each follower's controller and the modulus of each eigenvalue of the leader's system matrix is required to be equal to 1; moreover, the communication graph is required to be acyclic, i.e., a digraph with no loops. [26] proposes an RL-based algorithm to solve the linear continuous-time COORP without requiring knowledge of the followers' system models, while knowledge of the leader's system model is required for each follower and the eigenvalues of each follower need to be simple with zero real parts. In [27], distributed observers and adaptive controllers are designed for each follower to solve the COORP of nonlinear continuous-time multi-agent systems with unity relative degree, in which the exosystem is required to be stable and all the eigenvalues of each follower are required to be semi-simple with zero real parts.…”
Section: Introduction
confidence: 99%
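
For context on the terms used in this comparison (leader/exosystem system matrix, eigenvalue conditions), the generic textbook form of the linear cooperative output regulation setup is sketched below; the notation is illustrative and not necessarily that of the papers being compared.

```latex
% Generic (textbook) linear cooperative output regulation setup; the notation
% is illustrative, not taken verbatim from the cited papers.
\begin{aligned}
\dot{v}   &= S v,                        && \text{leader / exosystem} \\
\dot{x}_i &= A_i x_i + B_i u_i + E_i v,  && \text{follower } i \\
e_i       &= C_i x_i + D_i u_i + F_i v,  && \text{tracking error to be driven to } 0,
\end{aligned}
\qquad
\begin{aligned}
X_i S &= A_i X_i + B_i U_i + E_i, \\
0     &= C_i X_i + D_i U_i + F_i
\end{aligned}
\quad \text{(regulator equations).}
```

Conditions of the kind quoted above (eigenvalues with zero real parts or unit modulus, simple or semi-simple) typically ensure that the signals generated by the corresponding system matrix are bounded and persistent rather than decaying or diverging, which is what makes the regulation problem nontrivial.
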