Device-to-device (D2D) technology enables direct communication between devices and can alleviate the shortage of spectrum resources in 5G networks. Because channels are shared among multiple D2D user pairs, serious inter-pair interference may arise. To reduce this interference, increase network capacity, and improve wireless spectrum utilization, this paper proposes a distributed resource allocation algorithm that jointly uses a deep Q-network (DQN) and an unsupervised learning network. First, a DQN is constructed to solve channel allocation in a distributed manner under a dynamic, unknown environment. Then, a deep power-control neural network trained with an unsupervised learning strategy outputs an optimized power-control scheme that maximizes the transmit sum-rate subject to the corresponding constraints. Unlike traditional centralized approaches that require the collection of instantaneous global network information, the proposed algorithm treats each transmitter as a learning agent that performs channel selection and power control using only a small amount of locally collected state information. Simulation results show that the proposed algorithm converges faster and achieves a higher transmit sum-rate than other traditional centralized and distributed algorithms.
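To make the per-transmitter agent concrete, the following is a minimal sketch of a distributed DQN for channel selection, assuming PyTorch. The network, constants, and state layout (ChannelDQN, N_CHANNELS, STATE_DIM) are illustrative assumptions, not details taken from the paper; the reward would be the locally measured contribution to the sum-rate.

import random

import torch
import torch.nn as nn

N_CHANNELS = 4      # hypothetical number of shared channels
STATE_DIM = 8       # hypothetical size of the locally collected state
GAMMA = 0.95        # discount factor
EPSILON = 0.1       # epsilon-greedy exploration rate

class ChannelDQN(nn.Module):
    """Q-network mapping a local state to one Q-value per channel."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_CHANNELS),
        )

    def forward(self, state):
        return self.net(state)

def select_channel(q_net, state):
    """Epsilon-greedy channel selection from local observations only."""
    if random.random() < EPSILON:
        return random.randrange(N_CHANNELS)
    with torch.no_grad():
        return int(q_net(state).argmax().item())

def dqn_update(q_net, target_net, optimizer, batch):
    """One temporal-difference step on a minibatch from experience replay."""
    states, actions, rewards, next_states = batch
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + GAMMA * target_net(next_states).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Each transmitter would run its own copy of this agent, which is what makes the scheme distributed: no agent ever needs the global channel state.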
To solve the resource allocation problem in centralized multi-cell wireless cellular networks with multiple users and channels, a novel resource allocation algorithm that combines deep deterministic policy gradient (DDPG) reinforcement learning with unsupervised learning is proposed. First, the algorithm builds a DDPG-based channel-allocation deep neural network to provide an optimized channel allocation scheme. Second, it constructs an unsupervised power-control deep neural network to provide an optimized power control scheme. To give the unsupervised network perception of the dynamic wireless environment, experience replay is executed twice: once to train the channel-allocation network with DDPG reinforcement learning, and once to train the power-control network with unsupervised learning. Because the joint algorithm combines the dynamic-perception ability of DDPG reinforcement learning with the continuous-optimization ability of unsupervised learning, energy efficiency can be maximized effectively. Simulation results show that the proposed algorithm outperforms other algorithms in terms of energy efficiency and transmit rate in time-varying dynamic environments.
INDEX TERMS: Deep reinforcement learning, unsupervised learning, channel allocation, power control, wireless cellular networks.
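The unsupervised power-control step can be sketched as follows, again assuming PyTorch. The idea is that no labeled optimal powers are needed: the loss is the negative energy efficiency, so gradient descent on minibatches of channel realizations drawn from the replay buffer directly maximizes the paper's objective. All names and constants (PowerControlNet, K, P_MAX, NOISE, P_CIRCUIT) are hypothetical placeholders.

import torch
import torch.nn as nn

K = 4               # hypothetical number of links
P_MAX = 1.0         # hypothetical per-link power budget (W)
NOISE = 1e-3        # hypothetical receiver noise power
P_CIRCUIT = 0.1     # hypothetical fixed circuit power (W)

class PowerControlNet(nn.Module):
    """Maps the flattened channel-gain matrix to per-link transmit powers."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(K * K, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, K), nn.Sigmoid(),   # output in (0, 1)
        )

    def forward(self, gains):                  # gains: (batch, K, K)
        return P_MAX * self.net(gains.flatten(1))

def negative_energy_efficiency(gains, powers):
    """Unsupervised loss: minus sum-rate divided by total consumed power."""
    direct = gains.diagonal(dim1=1, dim2=2) * powers          # desired signal
    interference = torch.einsum('bij,bj->bi', gains, powers) - direct
    rate = torch.log2(1.0 + direct / (interference + NOISE)).sum(dim=1)
    ee = rate / (powers.sum(dim=1) + P_CIRCUIT)               # bit/s/Hz per W
    return -ee.mean()

# One training step: sample channel realizations (here random stand-ins,
# in the paper drawn from the second pass over the experience replay buffer)
# and descend the negative energy efficiency, so no labels are required.
net = PowerControlNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
gains = torch.rand(32, K, K)
loss = negative_energy_efficiency(gains, net(gains))
opt.zero_grad(); loss.backward(); opt.step()

Reusing the same replay buffer for both networks is what couples them: the DDPG pass shapes channel allocation, while this unsupervised pass continuously refines the powers on the channels actually chosen.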