To optimize interference resource allocation in communication network countermeasures, an interference resource allocation method based on maximum policy entropy deep reinforcement learning (MPEDRL) is proposed. The method introduces deep reinforcement learning into resource allocation for communication countermeasures. By adding the maximum policy entropy criterion and adaptively adjusting the entropy coefficient, it enhances policy exploration and accelerates convergence to the global optimum. Interference resource allocation is modeled as a Markov decision process. An interference strategy network outputs the allocation scheme, and an interference effect evaluation network with a clipped twin structure evaluates its efficacy. The policy network and the evaluation network are trained to maximize both the policy entropy and the cumulative interference efficacy, and the trained policy then determines the optimal interference resource allocation scheme. Simulation results show that the algorithm effectively solves the resource allocation problem in communication network confrontation. Compared with existing deep reinforcement learning methods, it learns faster and fluctuates less during training, and it achieves 15% higher jamming efficacy than the DDPG-based method.
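The abstract does not give the update rules, so the following is only a minimal sketch of how an entropy-regularized target with a clipped twin evaluation network and an adaptive entropy coefficient is commonly written. It assumes a discrete set of allocation actions and follows the general soft actor-critic pattern, which may differ from the MPEDRL formulation. All function names, parameters, and the `target_entropy` setting are hypothetical.

```python
import math

def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete allocation policy."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def soft_target(reward, gamma, next_probs, q1_next, q2_next, alpha):
    """Entropy-regularized target built from two evaluation networks.

    Assumption: q1_next and q2_next hold the two critics' estimates for
    every allocation action in the next state.
    """
    # Clipped twin estimate: the element-wise minimum of both critics
    # counteracts overestimation of the interference efficacy.
    q_min = [min(a, b) for a, b in zip(q1_next, q2_next)]
    # Expected value of the next state plus the entropy bonus weighted by alpha.
    v_next = sum(p * (q - alpha * math.log(p))
                 for p, q in zip(next_probs, q_min) if p > 0.0)
    return reward + gamma * v_next

def update_alpha(log_alpha, probs, target_entropy, lr):
    """One gradient step on the entropy coefficient, taken in log space.

    Uses the loss J(alpha) = alpha * (H(pi) - target_entropy), so alpha
    shrinks when the policy is more random than the target and grows
    when it is less random.
    """
    grad = math.exp(log_alpha) * (policy_entropy(probs) - target_entropy)
    return log_alpha - lr * grad
```

With `alpha = 0` the target reduces to an ordinary clipped double-Q target. A larger `alpha` rewards policies that keep several allocation schemes plausible, which is the exploration effect the abstract attributes to the maximum policy entropy criterion.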
To address intelligent anti-jamming decision-making in battlefield communication, this paper designs an intelligent decision-making method for communication anti-jamming based on deep reinforcement learning. By introducing experience replay and a PHC-based dynamic epsilon mechanism into the DQN framework, a dynamic epsilon-DQN intelligent decision-making method is proposed. The algorithm selects the value of epsilon according to the state of the decision network, which improves both the convergence speed and the decision success rate. During decision-making, jamming signals are detected on all communication frequencies, and the detection results are fed into the decision-making algorithm as jamming discriminant information, so that jamming can be avoided effectively without prior jamming information. Experimental results show that the proposed method adapts to various communication models and makes decisions quickly. After convergence, its average success rate exceeds 95%, a substantial advantage over existing decision-making methods.
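The abstract does not specify how epsilon is adjusted, so the sketch below only illustrates the general idea of a dynamic epsilon combined with epsilon-greedy frequency selection. It assumes that PHC refers to policy hill-climbing and that epsilon moves by a fixed step depending on whether the last decision avoided jamming. The function names, the step size, and the bounds are hypothetical and are not taken from the paper.

```python
import random

def adjust_epsilon(epsilon, success, step=0.05, eps_min=0.01, eps_max=1.0):
    """Hill-climbing-style update of the exploration rate.

    Assumption: explore less after a successful (unjammed) decision and
    more after a failed one, keeping epsilon inside [eps_min, eps_max].
    """
    epsilon = epsilon - step if success else epsilon + step
    return min(eps_max, max(eps_min, epsilon))

def select_channel(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over the candidate communication frequencies.

    q_values holds one value estimate per frequency, for example the
    output of the decision network for the current jamming detection result.
    """
    if rng.random() < epsilon:
        # Explore: pick any frequency uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the frequency with the highest estimated value.
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

A fixed decay schedule lowers epsilon regardless of how well the agent is doing. Tying epsilon to recent outcomes lets exploration rise again when the jamming pattern changes, which fits the abstract's claim that epsilon is chosen according to the state of the decision network.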