This paper addresses the problem of suppressing an integrated air defense system (IADS) through the cooperation of multiple fighters. Considering that the number of nodes changes dynamically during the operational process, a profit model capturing the influence of mission cost on the whole system is developed for both the offensive and defensive sides. A scenario analysis is given for the process of suppressing the IADS with multiple fighters. Based on this analysis, the modeling method and the specific expression of the payoff function are proposed for four cases at each node. Moreover, a distributed virtual learning algorithm is designed for the n-person, n-strategy game, so that the mixed strategy Nash equilibrium (MSNE) can be solved from the n × m × 3-dimensional profit space. Finally, simulation examples demonstrate the effectiveness of the proposed model and game algorithm.
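The abstract's "virtual learning" scheme for reaching a mixed strategy Nash equilibrium can be illustrated, under assumptions, by classical fictitious play: each player repeatedly best-responds to the opponent's empirical strategy, and the empirical action frequencies converge to the MSNE in zero-sum games. The sketch below uses a 2×2 matching-pennies game rather than the paper's n-person, n-strategy SEAD game, so it is only an assumption-laden toy example, not the authors' algorithm.

```python
import numpy as np

# Row player's payoff matrix for matching pennies (column player gets the negation).
# The unique MSNE is for both players to mix 50/50.
payoff = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])

# Empirical action counts for each player, initialized with a uniform prior.
counts = [np.ones(2), np.ones(2)]

for _ in range(20000):
    mix_row = counts[0] / counts[0].sum()   # row player's empirical mixed strategy
    mix_col = counts[1] / counts[1].sum()   # column player's empirical mixed strategy
    # Each player plays a best response to the other's empirical strategy.
    a_row = int(np.argmax(payoff @ mix_col))
    a_col = int(np.argmax(-(mix_row @ payoff)))
    counts[0][a_row] += 1
    counts[1][a_col] += 1

mix = counts[0] / counts[0].sum()
# The empirical row strategy approaches the MSNE [0.5, 0.5].
```

In the paper's distributed n-player setting each node would run the same best-response update against the empirical strategies of all other nodes, using its slice of the n × m × 3-dimensional payoff space.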
Unmanned aerial vehicle (UAV) swarm cooperative decision-making has attracted increasing attention because of its low-cost, reusable, and distributed characteristics. However, existing non-learning-based methods rely on small-scale, known scenarios and cannot solve complex multi-agent cooperation problems in large-scale, uncertain scenarios. This paper proposes a hierarchical multi-agent reinforcement learning (HMARL) method to solve the heterogeneous UAV swarm cooperative decision-making problem for the typical suppression of enemy air defense (SEAD) mission, which is decoupled into two sub-problems: the higher-level target allocation (TA) sub-problem and the lower-level cooperative attacking (CA) sub-problem. We establish an HMARL agent model consisting of a multi-agent deep Q network (MADQN) based TA agent and multiple independent asynchronous proximal policy optimization (IAPPO) based CA agents. The MADQN-TA agent can dynamically adjust the TA schemes according to the relative positions. To encourage exploration and improve learning efficiency, the Metropolis criterion and inter-agent information exchange techniques are introduced. The IAPPO-CA agents adopt an independent learning paradigm, which scales easily with the number of agents. Comparative simulation results validate the effectiveness, robustness, and scalability of the proposed method.

INDEX TERMS: UAV swarm; suppression of enemy air defense; deep reinforcement learning; multi-agent; hierarchical reinforcement learning
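The Metropolis criterion mentioned above, borrowed from simulated annealing, accepts a worse candidate action with probability exp((q_new − q_old)/T), so a high temperature T drives exploration early in training while a decaying T makes the policy increasingly greedy. The sketch below is a generic illustration of that acceptance rule, not the paper's exact integration with MADQN; the function name and signature are hypothetical.

```python
import math
import random

def metropolis_accept(q_old, q_new, temperature):
    """Metropolis acceptance rule: always accept an improvement; accept a
    worse candidate with probability exp((q_new - q_old) / temperature)."""
    if q_new >= q_old:
        return True
    return random.random() < math.exp((q_new - q_old) / temperature)

# An improving action is always accepted.
assert metropolis_accept(1.0, 2.0, temperature=0.1)

# A worse action is accepted often when the temperature is high
# (exp(-0.1) ≈ 0.90) and almost never when it is near zero.
hot = sum(metropolis_accept(1.0, 0.0, temperature=10.0) for _ in range(10000))
cold = sum(metropolis_accept(1.0, 0.0, temperature=0.01) for _ in range(10000))
```

In an exploration schedule, the temperature would typically be annealed over training episodes (e.g. geometric decay), recovering greedy action selection in the limit.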