The ultimate goal of military intelligence is to endow the command and control (C2) system with the decision-making art of excellent human commanders while being more agile and stable than a human. The intelligent commander Alpha C2 solves the dynamic decision-making problem in complex air defense scenarios using a deep reinforcement learning framework. Unlike traditional C2 systems that rely on expert rules and decision-making models, Alpha C2 interacts with a digital battlefield close to the real world and generates its own learning data. Taking the integrated states of multiple parties as input, a gated recurrent unit network introduces historical information, and an attention mechanism selects the object of each action, making the output decisions more reliable. Without learning from human combat experience, the neural network is trained in fixed- and random-strategy scenarios using a proximal policy optimization algorithm. Finally, 1,000 rounds of offline confrontation were conducted on the digital battlefield; the results show that Alpha C2 trained with a random strategy generalizes better and defeats its opponent at a higher winning rate than an Expert C2 system (72% vs. 21%). Its use of resources is also more reasonable than Expert C2's, reflecting a flexible and changeable art of command. INDEX TERMS Intelligent decision-making, deep reinforcement learning, Alpha C2, expert system.
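The abstract names proximal policy optimization (PPO) as the training algorithm. As a minimal illustration only (not the paper's implementation, whose network and hyperparameters are not given here), the clipped surrogate objective at the core of PPO can be sketched as:

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate objective of PPO for a single (state, action) sample.

    ratio     -- pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage -- estimated advantage A(s, a)
    eps       -- clipping range (0.2 is the commonly used default)
    """
    unclipped = ratio * advantage
    # clamp the ratio to [1 - eps, 1 + eps] before weighting by the advantage
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    # PPO maximizes the minimum of the two terms; return it as a loss to minimize
    return -min(unclipped, clipped)
```

The clipping prevents a single update from moving the policy too far from the one that collected the data, which is what makes on-policy training on generated battlefield data stable.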
The elephant herding optimization (EHO) algorithm is a novel metaheuristic optimizer inspired by the clan-updating and separation behaviors of elephant populations. Although it has few parameters and is easy to implement, it suffers from a lack of exploitation, leading to slow convergence. This paper proposes an improved EHO algorithm called manta ray foraging and Gaussian mutation-based EHO for global optimization (MGEHO). The clan-updating operator of the original EHO algorithm is replaced by the somersault foraging strategy of manta rays, which optimally adjusts the patriarchs' positions. Additionally, a dynamic convergence factor is set to balance exploration and exploitation, and Gaussian mutation is adopted to enhance population diversity, enabling MGEHO to maintain a strong local search capability. To evaluate the performance of different algorithms, 33 classical benchmark functions are chosen to verify the superiority of MGEHO. The enhanced paradigm is also compared with other advanced metaheuristic algorithms on 32 benchmark functions from IEEE CEC2014 and CEC2017. Furthermore, a scalability test, convergence analysis, statistical analysis, diversity analysis, and running time analysis demonstrate the effectiveness of MGEHO from various aspects. The results illustrate that MGEHO is superior to other algorithms in terms of solution accuracy and stability. Finally, MGEHO is applied to solve three real engineering problems. The comparison results show that this method is a powerful auxiliary tool for handling complex problems.
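The two operators the abstract borrows can be sketched as follows. This is an illustrative reading of the standard manta-ray somersault step and of Gaussian mutation, not the paper's exact update rules (the somersault factor `s=2.0` and mutation scale `sigma` are conventional choices, assumed here):

```python
import random

def somersault_update(x, best, s=2.0, rng=random):
    """Manta-ray somersault foraging step: pivot each dimension of a
    candidate solution x around the best solution found so far."""
    return [xi + s * (rng.random() * bi - rng.random() * xi)
            for xi, bi in zip(x, best)]

def gaussian_mutation(x, sigma=0.1, rng=random):
    """Perturb each dimension with zero-mean Gaussian noise to
    preserve population diversity."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]
```

In MGEHO, a step like `somersault_update` would replace the clan-updating operator for patriarch positions, while `gaussian_mutation` injects diversity to avoid premature convergence.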
A multisensor scheduling algorithm based on hybrid task decomposition and modified binary particle swarm optimization (MBPSO) is proposed. First, to handle the complex relationship between sensor resources and tasks, a hybrid task decomposition method is presented: the resource scheduling problem is decomposed into subtasks, turning sensor resource scheduling into a matching problem between sensors and subtasks. Second, a resource-matching optimization model based on the sensor resources and tasks is established, which considers several factors such as target priority, detection benefit, handover times, and resource load. Finally, the MBPSO algorithm is proposed to solve the matching optimization model effectively, based on improved update rules for particle velocity and position that use a doubt factor and a modified sigmoid function. The experimental results show that the proposed algorithm is better in terms of convergence speed, search capability, solution accuracy, and efficiency.
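For context, the canonical binary-PSO position update that the "modified sigmoid" rule builds on maps a particle's real-valued velocity to a bit probability. The sketch below shows the standard rule only; the paper's doubt factor and modified sigmoid are not specified in the abstract, so they are not reproduced here:

```python
import math, random

def sigmoid(v):
    """Standard logistic transfer function used in binary PSO."""
    return 1.0 / (1.0 + math.exp(-v))

def update_bit(velocity, rng=random):
    """Canonical binary-PSO rule: set the bit (assign the sensor to the
    subtask) with probability sigmoid(velocity)."""
    return 1 if rng.random() < sigmoid(velocity) else 0
```

Large positive velocities thus drive a bit toward 1 and large negative velocities toward 0; a modified sigmoid typically reshapes this curve to improve the exploration/exploitation balance.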
Clustering analysis is essential for obtaining valuable information from a given dataset. However, traditional clustering methods tend to fall into local optima and depend heavily on the quality of the initial solution. To address these defects, a novel clustering method called gradient-based elephant herding optimization for cluster analysis (GBEHO) is proposed. A well-defined set of heuristics is introduced to select the initial centroids instead of choosing random initial points. Specifically, the elephant herding optimization (EHO) algorithm is combined with the gradient-based optimizer (GBO) to assign initial cluster centers across the search space. Second, to overcome the imbalance between exploration and exploitation in the original EHO, the initial population is improved by introducing Gaussian chaos mapping. In addition, two operators, a random wandering operator and a mutation operator, adjust the agents' location-update strategy. Nine synthetic and real-world datasets are adopted to evaluate the effectiveness of the proposed algorithm against other metaheuristic algorithms. The results show that the proposed algorithm ranks first among the 10 compared algorithms. It is also compared extensively with state-of-the-art techniques using four evaluation criteria: accuracy rate, specificity, detection rate, and F-measure. The obtained results clearly indicate the excellent performance of GBEHO, whose stability is also more prominent.
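Metaheuristic clustering methods like GBEHO typically encode a candidate solution as a set of centroids and minimize the within-cluster sum of squared errors (SSE) as the fitness function. The abstract does not state GBEHO's exact fitness, so the SSE below is the conventional assumption:

```python
def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest
    centroid -- the fitness commonly minimized in metaheuristic clustering."""
    return sum(min(sum((pi - ci) ** 2 for pi, ci in zip(p, c))
                   for c in centroids)
               for p in points)
```

An optimizer such as GBEHO would move the centroid set through the search space to drive this value down, with the EHO/GBO hybrid supplying better-than-random starting centroids.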
Deep reinforcement learning (DRL) methods are inefficient during initial strategy exploration in large-scale complex scenarios, which is becoming one of the bottlenecks in their application to large-scale game adversarial scenarios. This paper proposes a Safe reinforcement learning combined with Imitation learning for Task Assignment (SITA) method for a representative red-blue confrontation scenario. To address the difficulty of sampling for imitation learning (IL), we combine human knowledge with adversarial rules to build a knowledge rule base. We propose the Imitation Learning with Decoupled Network (ILDN) pre-training method to reduce excessive invalid exploration at the start of training. To further reduce invalid exploration and improve stability in the later stages of training, we incorporate a safe reinforcement learning (Safe RL) method after pre-training. Finally, experiments on the digital battlefield verify that the SITA method achieves higher training efficiency and strong generalization ability in large-scale complex scenarios.
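The imitation-learning pre-training stage normally minimizes a behavior-cloning loss: the negative log-likelihood of the demonstrated (here, rule-base-generated) action under the policy. The abstract does not give ILDN's exact objective, so this single-sample sketch is only the conventional starting point:

```python
import math

def bc_loss(action_probs, expert_action):
    """Behavior-cloning loss for one sample: negative log-likelihood of the
    demonstrated action under the policy's action distribution."""
    return -math.log(action_probs[expert_action])
```

Pre-training on such a loss biases the initial policy toward the rule base's behavior, so the subsequent (safe) RL phase starts from sensible actions rather than random exploration.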