2017
DOI: 10.1002/acs.2831

Open‐loop Stackelberg learning solution for hierarchical control problems

Abstract: This work presents a novel framework based on adaptive learning techniques to solve continuous-time open-loop Stackelberg games. The method yields real-time approximations of the game value and convergence of the policies to the open-loop Stackelberg-equilibrium solution, while also guaranteeing asymptotic stability of the equilibrium point of the closed-loop system. It is implemented as a separate actor/critic parametric network approximator structure for every player and involves simultaneous continuous-…
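To make the actor/critic structure described in the abstract concrete, the Python sketch below sets up one critic (a quadratic value-function approximator) and one actor (a linear feedback gain) per player and tunes them simultaneously along a single simulated trajectory. This is a generic synchronous actor/critic sketch under assumed dynamics, basis functions, and learning rates; it does not reproduce the paper's tuning laws, and it ignores the leader-follower hierarchy that distinguishes the Stackelberg solution from a Nash one.

```python
# Generic two-player actor/critic sketch (illustrative assumptions throughout;
# NOT the tuning laws of the cited paper).
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear dynamics dx/dt = A x + B1 u1 + B2 u2 and quadratic costs.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = [np.array([[0.0], [1.0]]),          # player 1 (leader) input channel, assumed
     np.array([[0.0], [0.5]])]          # player 2 (follower) input channel, assumed
Q = [np.eye(2), 2.0 * np.eye(2)]        # state-cost weights per player, assumed
R = [np.eye(1), np.eye(1)]              # control-cost weights per player, assumed

def phi(x):
    """Quadratic critic basis; player i's value estimate is V_i(x) = W[i] @ phi(x)."""
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def dphi(x):
    """Jacobian of the basis with respect to x (3x2)."""
    return np.array([[2*x[0], 0.0],
                     [x[1],   x[0]],
                     [0.0,    2*x[1]]])

# Critic weights and actor feedback gains for the two players.
W = [0.1 * rng.standard_normal(3) for _ in range(2)]
K = [np.zeros((1, 2)) for _ in range(2)]

x = np.array([1.0, -1.0])
dt, a_critic, a_actor = 0.002, 5.0, 1.0   # illustrative step size and learning rates

for _ in range(20000):
    u = [K[i] @ x for i in range(2)]                      # current feedback policies
    xdot = A @ x + B[0] @ u[0] + B[1] @ u[1]
    for i in range(2):
        # Continuous-time Bellman-like residual: dV_i/dt along the trajectory + running cost.
        cost_i = x @ Q[i] @ x + u[i] @ R[i] @ u[i]
        e_i = W[i] @ (dphi(x) @ xdot) + cost_i
        W[i] = W[i] - dt * a_critic * e_i * (dphi(x) @ xdot)   # critic gradient step
        # Actor: nudge the gain toward the policy implied by the current critic.
        grad_V = dphi(x).T @ W[i]                              # estimate of dV_i/dx
        u_star = -0.5 * np.linalg.solve(R[i], B[i].T @ grad_V[:, None]).ravel()
        K[i] = K[i] + dt * a_actor * np.outer(u_star - u[i], x)
    x = x + dt * xdot                                           # Euler state update

print("critic weights:", W)
print("feedback gains:", K)
```

In the paper's setting the critic weights would be tuned to satisfy a Hamilton-Jacobi-type equation and the leader's update would anticipate the follower's reaction; here both players use the same plain gradient rule purely for illustration.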

Cited by 19 publications (8 citation statements) · References 39 publications (48 reference statements)
“…Theorem 2. Consider the leader dynamics (4)-(5) and the adaptive observer (35)-(37). Let the leaders' dynamics estimation errors, and the i-th follower's estimation errors of the convex combination of the leaders' states and outputs, be the deviations of the corresponding observer estimates from their respective target values.…”
Section: Distributed Adaptive Observer for Leaders' Convex Hull · Citation type: mentioning · Confidence: 99%
“…Theorem 3. Consider the multi-agent system (2)-(5) and the distributed adaptive observer (35) along with the adaptation laws (36) and (37). Under Assumptions 1-6, Problem 2, and consequently Problem 1, are solved using the optimal control policy (68) together with the quantities given by (60) and (61).…”
Section: Note that · Citation type: mentioning · Confidence: 99%
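For readers unfamiliar with the kind of observer these excerpts refer to, the sketch below implements a generic containment-style distributed observer: each follower propagates a local copy of the (here assumed known) leader dynamics, driven by consensus coupling with neighbouring followers and pinning to the leaders it can observe, so its estimate converges to a convex combination of the leaders' states. The graph, gains, and matrices are illustrative assumptions; this is not the adaptive observer (35)-(37) of the cited paper, which additionally estimates the leaders' dynamics.

```python
# Generic containment-style distributed observer sketch (illustrative assumptions;
# NOT equations (35)-(37) of the cited paper).
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])       # leaders' state matrix (marginally stable oscillator, assumed)
n_followers, n_leaders, c, dt = 3, 2, 5.0, 0.001

# a[i][j]: coupling among followers; g[i][k]: pinning gain from leader k to follower i (assumed graph)
a = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
g = np.array([[1, 0], [0, 0], [0, 1]], dtype=float)

rng = np.random.default_rng(1)
x_leader = rng.standard_normal((n_leaders, 2))        # leader states
x_hat = rng.standard_normal((n_followers, 2))         # followers' estimates

for _ in range(20000):
    x_leader = x_leader + dt * x_leader @ A.T          # leaders follow dx/dt = A x
    new = np.empty_like(x_hat)
    for i in range(n_followers):
        # Consensus term over neighbouring followers plus pinning to observed leaders.
        coupling = sum(a[i, j] * (x_hat[j] - x_hat[i]) for j in range(n_followers))
        pinning = sum(g[i, k] * (x_leader[k] - x_hat[i]) for k in range(n_leaders))
        new[i] = x_hat[i] + dt * (x_hat[i] @ A.T + c * (coupling + pinning))
    x_hat = new

print(x_hat)   # each row converges to a convex combination of the two leader states
```

Under these assumed gains, follower 0's estimate weights leader 0 most heavily, follower 2 weights leader 1 most heavily, and the middle follower settles between them.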
“…In generative adversarial networks, 24 for instance, two neural networks essentially play a competitive game in which they continuously optimize themselves to reach a Nash equilibrium. Mean-field games (MFGs) and stochastic games (SGs) 25 are used to guide the deep reinforcement learning approach. 26 One of the classical models for games in which players take turns to move is the extensive-form game.…”
Section: Introduction · Citation type: mentioning · Confidence: 99%
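As a toy illustration of the game-theoretic view in this excerpt, the snippet below lets two scalar "players" optimize opposite objectives simultaneously, so their coupled gradient updates settle at the Nash equilibrium of the game. The bilinear-quadratic objective and step size are assumptions chosen so the iteration converges; this is not a GAN or MFG implementation.

```python
# Toy two-player zero-sum game: player 1 minimizes f(a, b), player 2 maximizes it.
# Simultaneous gradient descent/ascent converges to the Nash equilibrium (0, 0).
import numpy as np

def f(a, b):
    return 0.5 * a**2 - 0.5 * b**2 + a * b   # convex in a, concave in b (assumed objective)

a, b, lr = 2.0, -1.5, 0.05
for _ in range(500):
    grad_a = a + b          # df/da
    grad_b = -b + a         # df/db
    a -= lr * grad_a        # minimizing player: gradient descent
    b += lr * grad_b        # maximizing player: gradient ascent

print(f"a={a:.4f}, b={b:.4f}, f={f(a, b):.6f}")  # both near zero at the equilibrium
```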
“…An off-policy reinforcement learning algorithm based on an actor-critic structure is developed to solve an optimal containment control problem using only relative output measurements between agents, without knowledge of the leaders' exact dynamics. Vamvoudakis et al. propose a novel learning-based adaptive solution to a two-player differential game. More specifically, this paper introduces an actor/critic learning solution to a continuous-time open-loop Stackelberg game.…”
Section: Introduction · Citation type: mentioning · Confidence: 99%