Summary
Dynamic power allocation (DPA) is a key technique for improving system throughput in multibeam satellite systems: it matches the capacity offered to each beam with the capacity that beam requires. Existing power allocation studies tend to adopt metaheuristic optimization algorithms such as the genetic algorithm. Because these methods compute the allocation offline, the resulting DPA cannot adapt to dynamic environments in which traffic demands and channel conditions vary. To solve this problem, this paper proposes an online algorithm, the deep reinforcement learning‐based dynamic power allocation (DRL‐DPA) algorithm. The key idea of DRL‐DPA is to make power allocation decisions online rather than offline, as traditional metaheuristic methods do. Simulation results show that the proposed DRL‐DPA algorithm improves system performance in terms of system throughput and power consumption in multibeam satellite systems.
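To make the objective concrete, the following minimal Python sketch illustrates what "matching offered capacity with required capacity" could mean for a few beams sharing one power budget. It is not the DRL‐DPA algorithm from the paper. The Shannon-capacity link model, the neglect of inter-beam interference, the demand-proportional baseline, and every name and numeric value (`offered_capacity`, `served_throughput`, `proportional_allocation`, `TOTAL_POWER_W`, and so on) are assumptions introduced here for illustration only.

```python
import math


def offered_capacity(power_w, bandwidth_hz, gain, noise_w):
    """Shannon capacity (bit/s) of one beam.

    Assumption: inter-beam interference is ignored.
    """
    return bandwidth_hz * math.log2(1.0 + power_w * gain / noise_w)


def served_throughput(powers_w, demands_bps, bandwidth_hz, gains, noise_w):
    """Traffic actually served: per beam, the smaller of offered capacity and demand."""
    return sum(
        min(offered_capacity(p, bandwidth_hz, g, noise_w), d)
        for p, g, d in zip(powers_w, gains, demands_bps)
    )


def proportional_allocation(total_power_w, demands_bps):
    """Hypothetical baseline: split the power budget in proportion to demand."""
    total_demand = sum(demands_bps)
    if total_demand == 0:
        return [0.0] * len(demands_bps)
    return [total_power_w * d / total_demand for d in demands_bps]


# Hypothetical scenario (all values are illustrative, not taken from the paper).
TOTAL_POWER_W = 100.0
BANDWIDTH_HZ = 100e6
NOISE_W = 1e-12
GAINS = [1e-12, 1e-12, 1e-12, 1e-12]
DEMANDS_BPS = [100e6, 200e6, 500e6, 700e6]

uniform_powers = [TOTAL_POWER_W / len(DEMANDS_BPS)] * len(DEMANDS_BPS)
matched_powers = proportional_allocation(TOTAL_POWER_W, DEMANDS_BPS)

uniform_served = served_throughput(
    uniform_powers, DEMANDS_BPS, BANDWIDTH_HZ, GAINS, NOISE_W
)
matched_served = served_throughput(
    matched_powers, DEMANDS_BPS, BANDWIDTH_HZ, GAINS, NOISE_W
)

print(f"uniform allocation serves  {uniform_served / 1e6:.1f} Mbit/s")
print(f"demand-aware allocation serves {matched_served / 1e6:.1f} Mbit/s")
```

In this toy setting, a uniform split gives lightly loaded beams more capacity than they can use while heavily loaded beams stay short, which is the mismatch DPA aims to reduce. An online method such as the one the paper proposes would recompute the allocation as demands and channel gains change; the static baseline above does not.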