2023
DOI: 10.1016/j.dt.2022.04.001
Task assignment in ground-to-air confrontation based on multiagent deep reinforcement learning

Cited by 19 publications (10 citation statements)
References 7 publications
“…To satisfy the rationality and completeness of the state space and action space and meet the needs of the game confrontation scenario, the state space, action space, and reward function in this paper are designed with reference to the literature [25].…”
Section: MDP Modeling
confidence: 99%
“…We equate the CMDP problem in this paper to an unconstrained max-min optimization problem, building on the RL algorithm of the literature [25] and solving it with the PPO-Lagrangian algorithm [33].…”
Section: PPO-Lagrangian
confidence: 99%
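The max-min reduction quoted above can be sketched in a few lines. The code below is an illustrative toy, not the cited papers' implementation: all function names are hypothetical, and only the core idea is shown — a constrained MDP, max J_r(θ) subject to J_c(θ) ≤ d, is relaxed to max_θ min_{λ≥0} J_r(θ) − λ(J_c(θ) − d), with the multiplier λ updated by projected gradient ascent between PPO policy steps.

```python
# Hedged sketch of the Lagrangian relaxation behind PPO-Lagrangian-style
# methods (hypothetical names; not code from the cited work).

def lagrangian(j_reward: float, j_cost: float, lam: float, d: float) -> float:
    """Objective the policy's PPO step would maximize for a fixed lam."""
    return j_reward - lam * (j_cost - d)

def multiplier_step(lam: float, j_cost: float, d: float, lr: float = 0.1) -> float:
    """Projected gradient ascent: lam grows while the cost limit is violated,
    and is clipped at zero so the penalty never becomes a bonus."""
    return max(0.0, lam + lr * (j_cost - d))

# Toy trace: cost limit d = 1.0, estimated episode cost 1.5, so lam rises
# by lr * 0.5 per update, increasing the penalty on the violating policy.
lam = 0.0
for _ in range(3):
    lam = multiplier_step(lam, j_cost=1.5, d=1.0)
print(round(lam, 2))  # 0.15
```

In practice the two updates alternate: the policy takes a clipped PPO step on the Lagrangian for the current λ, then λ is adjusted from fresh cost estimates, which is what makes the problem effectively unconstrained for the inner policy optimizer.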
“…At the end of training, the critic is no longer used. The algorithm for training the agents is one of the critical issues studied in this paper and is described in detail in Section “Model-based model predictive control with proximal policy optimization algorithm.” The training method for the scheduling agent follows the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm from the literature (Liu J. Y. et al., 2022), both to ensure the training effect and to show more intuitively the changes the executive agent brings.…”
Section: Hierarchical Architecture Design For Agents
confidence: 99%
“…A centralized assignment solution is not fast enough, while a fully distributed assignment method does not respond effectively to unexpected events (Lee et al., 2012). The one-general agent with multiple narrow agents (OGMN) architecture proposed in the literature (Liu J. Y. et al., 2022), which divides agents into general and narrow agents, improves computational speed and coordination ability. However, the narrow agents in OGMN are entirely rule-driven.…”
Section: Introduction
confidence: 99%
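The division of labor that the OGMN architecture is quoted as using can be illustrated schematically. The sketch below is a hypothetical toy (class and method names are assumptions, not from either paper): one general agent produces a task assignment, and several narrow agents each execute their assigned task by a fixed rule — the rule-driven executor being exactly the limitation the citing paper raises.

```python
# Toy sketch of an OGMN-style general/narrow split (illustrative only;
# all names are hypothetical and not taken from the cited work).
from typing import Dict, List

class GeneralAgent:
    """Centralized assigner: maps observed tasks to narrow-agent ids."""
    def assign(self, tasks: List[str], n_narrow: int) -> Dict[int, str]:
        # Round-robin stands in here for the learned task-assignment
        # policy (e.g. PPO-TAGNA in the source literature).
        return {i: tasks[i % len(tasks)] for i in range(n_narrow)}

class NarrowAgent:
    """Rule-driven executor: acts deterministically on its assigned task,
    which is what makes it unable to adapt without being made trainable."""
    def __init__(self, agent_id: int) -> None:
        self.agent_id = agent_id

    def act(self, task: str) -> str:
        return f"agent{self.agent_id}:engage:{task}"

tasks = ["target_A", "target_B"]
general = GeneralAgent()
narrows = [NarrowAgent(i) for i in range(3)]
assignment = general.assign(tasks, n_narrow=len(narrows))
actions = [a.act(assignment[a.agent_id]) for a in narrows]
print(actions)
```

The structural point is that only the general agent carries a policy; replacing the narrow agents' fixed `act` rule with a learned policy is the direction the citing work takes.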