2017
DOI: 10.1002/acs.2831

Open‐loop Stackelberg learning solution for hierarchical control problems

Abstract: This work presents a novel framework based on adaptive learning techniques to solve continuous-time open-loop Stackelberg games. The method yields real-time approximations of the game value and convergence of the policies to the open-loop Stackelberg-equilibrium solution, while also guaranteeing asymptotic stability of the equilibrium point of the closed-loop system. It is implemented as a separate actor/critic parametric network approximator structure for every player and involves simultaneous continuous-…
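To make the actor/critic structure described in the abstract concrete, the Python sketch below sets up one critic (a quadratic value-function approximator) and one actor (a linear feedback gain) per player and tunes them simultaneously along a single simulated trajectory. This is a generic synchronous actor/critic sketch under assumed dynamics, basis functions, and learning rates; it does not reproduce the paper's tuning laws, and it ignores the leader-follower hierarchy that distinguishes the Stackelberg solution from a Nash one.

```python
# Generic two-player actor/critic sketch (illustrative assumptions throughout;
# NOT the tuning laws of the cited paper).
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear dynamics dx/dt = A x + B1 u1 + B2 u2 and quadratic costs.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = [np.array([[0.0], [1.0]]),          # player 1 (leader) input channel, assumed
     np.array([[0.0], [0.5]])]          # player 2 (follower) input channel, assumed
Q = [np.eye(2), 2.0 * np.eye(2)]        # state-cost weights per player, assumed
R = [np.eye(1), np.eye(1)]              # control-cost weights per player, assumed

def phi(x):
    """Quadratic critic basis; player i's value estimate is V_i(x) = W[i] @ phi(x)."""
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def dphi(x):
    """Jacobian of the basis with respect to x (3x2)."""
    return np.array([[2*x[0], 0.0],
                     [x[1],   x[0]],
                     [0.0,    2*x[1]]])

# Critic weights and actor feedback gains for the two players.
W = [0.1 * rng.standard_normal(3) for _ in range(2)]
K = [np.zeros((1, 2)) for _ in range(2)]

x = np.array([1.0, -1.0])
dt, a_critic, a_actor = 0.002, 5.0, 1.0   # illustrative step size and learning rates

for _ in range(20000):
    u = [K[i] @ x for i in range(2)]                      # current feedback policies
    xdot = A @ x + B[0] @ u[0] + B[1] @ u[1]
    for i in range(2):
        # Continuous-time Bellman-like residual: dV_i/dt along the trajectory + running cost.
        cost_i = x @ Q[i] @ x + u[i] @ R[i] @ u[i]
        e_i = W[i] @ (dphi(x) @ xdot) + cost_i
        W[i] = W[i] - dt * a_critic * e_i * (dphi(x) @ xdot)   # critic gradient step
        # Actor: nudge the gain toward the policy implied by the current critic.
        grad_V = dphi(x).T @ W[i]                              # estimate of dV_i/dx
        u_star = -0.5 * np.linalg.solve(R[i], B[i].T @ grad_V[:, None]).ravel()
        K[i] = K[i] + dt * a_actor * np.outer(u_star - u[i], x)
    x = x + dt * xdot                                           # Euler state update

print("critic weights:", W)
print("feedback gains:", K)
```

In the paper's setting the critic weights would be tuned to satisfy a Hamilton-Jacobi-type equation and the leader's update would anticipate the follower's reaction; here both players use the same plain gradient rule purely for illustration.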

Cited by 19 publications (8 citation statements) · References 39 publications (48 reference statements)
“…Theorem 2. Consider the leader dynamics (4)-(5) and the adaptive observer (35)-(37). Let the leaders' dynamics estimation errors, and the i-th follower's estimation errors of the convex combination of the leaders' states and outputs, be the deviations of the corresponding observer estimates from their respective target values.…”
Section: Distributed Adaptive Observer for Leaders' Convex Hull · Citation type: mentioning · Confidence: 99%
“…Theorem 3. Consider the multi-agent system (2)-(5) and the distributed adaptive observer (35) along with the adaptation laws (36) and (37). Under Assumptions 1-6, Problem 2, and consequently Problem 1, are solved using the optimal control policy (68) together with the quantities given by (60) and (61).…”
Section: Note that · Citation type: mentioning · Confidence: 99%
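For readers unfamiliar with the kind of observer these excerpts refer to, the sketch below implements a generic containment-style distributed observer: each follower propagates a local copy of the (here assumed known) leader dynamics, driven by consensus coupling with neighbouring followers and pinning to the leaders it can observe, so its estimate converges to a convex combination of the leaders' states. The graph, gains, and matrices are illustrative assumptions; this is not the adaptive observer (35)-(37) of the cited paper, which additionally estimates the leaders' dynamics.

```python
# Generic containment-style distributed observer sketch (illustrative assumptions;
# NOT equations (35)-(37) of the cited paper).
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])       # leaders' state matrix (marginally stable oscillator, assumed)
n_followers, n_leaders, c, dt = 3, 2, 5.0, 0.001

# a[i][j]: coupling among followers; g[i][k]: pinning gain from leader k to follower i (assumed graph)
a = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
g = np.array([[1, 0], [0, 0], [0, 1]], dtype=float)

rng = np.random.default_rng(1)
x_leader = rng.standard_normal((n_leaders, 2))        # leader states
x_hat = rng.standard_normal((n_followers, 2))         # followers' estimates

for _ in range(20000):
    x_leader = x_leader + dt * x_leader @ A.T          # leaders follow dx/dt = A x
    new = np.empty_like(x_hat)
    for i in range(n_followers):
        # Consensus term over neighbouring followers plus pinning to observed leaders.
        coupling = sum(a[i, j] * (x_hat[j] - x_hat[i]) for j in range(n_followers))
        pinning = sum(g[i, k] * (x_leader[k] - x_hat[i]) for k in range(n_leaders))
        new[i] = x_hat[i] + dt * (x_hat[i] @ A.T + c * (coupling + pinning))
    x_hat = new

print(x_hat)   # each row converges to a convex combination of the two leader states
```

Under these assumed gains, follower 0's estimate weights leader 0 most heavily, follower 2 weights leader 1 most heavily, and the middle follower settles between them.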
“…In generative adversarial networks, 24 for instance, two neural networks essentially play a competitive game in which they continuously optimize themselves to reach a Nash equilibrium. Mean-field games (MFGs) and stochastic games (SGs) 25 are used to guide the deep reinforcement learning approach. 26 One of the classical models for games in which players take turns to move is the extensive-form game.…”
Section: Introduction · Citation type: mentioning · Confidence: 99%
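As a toy illustration of the game-theoretic view in this excerpt, the snippet below lets two scalar "players" optimize opposite objectives simultaneously, so their coupled gradient updates settle at the Nash equilibrium of the game. The bilinear-quadratic objective and step size are assumptions chosen so the iteration converges; this is not a GAN or MFG implementation.

```python
# Toy two-player zero-sum game: player 1 minimizes f(a, b), player 2 maximizes it.
# Simultaneous gradient descent/ascent converges to the Nash equilibrium (0, 0).
import numpy as np

def f(a, b):
    return 0.5 * a**2 - 0.5 * b**2 + a * b   # convex in a, concave in b (assumed objective)

a, b, lr = 2.0, -1.5, 0.05
for _ in range(500):
    grad_a = a + b          # df/da
    grad_b = -b + a         # df/db
    a -= lr * grad_a        # minimizing player: gradient descent
    b += lr * grad_b        # maximizing player: gradient ascent

print(f"a={a:.4f}, b={b:.4f}, f={f(a, b):.6f}")  # both near zero at the equilibrium
```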
“…An off-policy reinforcement learning algorithm based on an actor-critic structure is developed to solve an optimal containment control problem using only relative output measurements between agents, without knowledge of the leaders' exact dynamics. Vamvoudakis et al. propose a novel learning-based adaptive solution to a two-player differential game. More specifically, this paper introduces an actor/critic learning solution to a continuous-time open-loop Stackelberg game.…”
Section: Introduction · Citation type: mentioning · Confidence: 99%