2014
DOI: 10.1109/tsmc.2013.2295351

Online Synchronous Approximate Optimal Learning Algorithm for Multi-Player Non-Zero-Sum Games With Unknown Dynamics

Cited by 217 publications (69 citation statements)
References 46 publications
“…In [43], an optimal learning algorithm based on policy iteration is used to solve a multiplayer nonzero-sum game without requiring exact knowledge of the system dynamics. In [74], a near-optimal control scheme is proposed to solve the nonzero-sum differential games of continuous-time nonlinear systems using neural networks.…”
Section: Chapter 5, Multiplayer Game in Orbit (mentioning)
confidence: 99%
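For orientation, the multiplayer nonzero-sum game these citations refer to is usually posed as follows. This is the standard formulation from the ADP game literature, not necessarily the exact notation of the indexed paper; the symbols f, g_j, Q_i, R_ij, V_i are used here only for illustration.

    \dot{x} = f(x) + \sum_{j=1}^{N} g_j(x)\, u_j ,
    \qquad
    J_i = \int_{0}^{\infty} \Big( Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Big)\, dt ,
    \quad i = 1, \dots, N .

A set of Nash-equilibrium policies is characterized by the coupled Hamilton-Jacobi equations

    0 = Q_i(x) + \nabla V_i^{\top} \Big( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j^{*} \Big)
        + \sum_{j=1}^{N} (u_j^{*})^{\top} R_{ij}\, u_j^{*} ,
    \qquad
    u_i^{*} = -\tfrac{1}{2}\, R_{ii}^{-1} g_i^{\top}(x)\, \nabla V_i ,

which policy-iteration and ADP schemes solve approximately rather than analytically.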
“…Policy iteration can be employed to reduce the computational cost and continuously update control policies by evaluating the interaction performance [20]. Methods of policy iteration for games with known and unknown dynamics have been developed by several research groups [21], [22], [23]. As mentioned above, however, the human's objective is generally unknown to the robot in a typical human-robot interaction scenario.…”
Section: Introduction (mentioning)
confidence: 99%
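As a rough illustration of how such a policy-iteration scheme operates, here is a minimal offline sketch for a 2-player linear-quadratic nonzero-sum game with known dynamics. The cited papers, including the one indexed here, learn online and without exact model knowledge, which this sketch does not attempt; all numerical values below are made up.

    # Hypothetical sketch: offline policy iteration for a 2-player LQ nonzero-sum game.
    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    # Linear dynamics dx/dt = A x + B1 u1 + B2 u2 (illustrative numbers).
    A = np.array([[-1.0, 0.5], [0.0, -2.0]])   # chosen stable so K_i = 0 is admissible
    B = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
    Q = [np.eye(2), 2 * np.eye(2)]             # state weights of players 1 and 2
    R = [[np.eye(1), np.eye(1)],               # R[i][j] weights u_j in player i's cost
         [np.eye(1), np.eye(1)]]
    K = [np.zeros((1, 2)), np.zeros((1, 2))]   # initial admissible gains, u_i = -K_i x

    for _ in range(50):
        Acl = A - sum(B[j] @ K[j] for j in range(2))   # closed loop under current policies
        P = []
        for i in range(2):
            # Policy evaluation: Acl' P_i + P_i Acl + Q_i + sum_j K_j' R_ij K_j = 0
            M = Q[i] + sum(K[j].T @ R[i][j] @ K[j] for j in range(2))
            P.append(solve_continuous_lyapunov(Acl.T, -M))
        # Policy improvement: K_i <- R_ii^{-1} B_i' P_i
        K_new = [np.linalg.solve(R[i][i], B[i].T @ P[i]) for i in range(2)]
        if max(np.linalg.norm(K_new[i] - K[i]) for i in range(2)) < 1e-9:
            break
        K = K_new

    print("Approximate Nash feedback gains:", K[0], K[1])

Each pass evaluates both players' costs under the current feedback gains via Lyapunov equations, then improves each gain against the other player's current policy. Convergence of this simultaneous update is not guaranteed for arbitrary games, which is part of what the cited methods address.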
“…Additionally, in the field of optimal control, adaptive dynamic programming (ADP) [18]-[21] is a significant and active research topic. Many data-driven and model-free methods based on ADP have been established [22]-[32]. Different from the above methods, virtual reference feedback tuning (VRFT), originally proposed by Guardabassi and Savaresi [33], provides a global solution to a model reference control problem with one-shot off-line data.…”
Section: Introduction (mentioning)
confidence: 99%
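For context on the VRFT idea mentioned in this excerpt, the usual data-driven criterion (in the notation of the VRFT literature, not of the citing paper) is roughly:

    \bar{r}(t) = M^{-1} y(t), \qquad \bar{e}(t) = \bar{r}(t) - y(t),
    \qquad
    J_{VR}(\theta) = \frac{1}{N} \sum_{t=1}^{N} \big( u(t) - C(z; \theta)\, \bar{e}(t) \big)^{2},

where (u, y) is a single batch of plant input/output data, M is the reference model, and C(z; θ) is the controller class; minimizing J_VR over θ tunes the controller from the one-shot data set without identifying a plant model.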