Deep reinforcement learning (DRL) has been successfully applied to joint routing and resource management in large-scale cognitive radio networks. However, it requires many trial-and-error interactions with the environment, which results in high energy consumption and transmission delay. In this paper, an apprenticeship learning scheme is proposed for energy-efficient cross-layer routing design. Firstly, to guarantee energy efficiency and compress the huge action space, a novel concept called dynamic adjustment rating is introduced, which regulates transmit power efficiently through a multi-level transition mechanism. On top of this, Prioritized Memories Deep Q-learning from Demonstrations (PM-DQfD) is presented to speed up convergence and reduce memory occupation. PM-DQfD is then applied to the cross-layer routing design to improve power efficiency and reduce routing latency. Simulation results confirm that the proposed method achieves higher energy efficiency, shorter routing latency and a higher packet delivery ratio than traditional algorithms such as Cognitive Radio Q-routing (CRQ-routing), Prioritized Memories Deep Q-Network (PM-DQN), and the Conjecture Based Multi-agent Q-learning Scheme (CBMQ).
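The abstract above does not define dynamic adjustment rating, so the following is only a minimal sketch of one generic way to shrink an action space when transmit power is a decision variable: the agent picks a relative step (down, hold, up) instead of an absolute power level. The power levels, the three-step adjustment set, and all function names are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration: names and values here are assumptions, not the paper's.
POWER_LEVELS_DBM = [0, 5, 10, 15, 20, 25, 30]  # assumed discrete transmit power levels
ADJUSTMENTS = (-1, 0, 1)                       # assumed moves: step down, hold, step up


def apply_adjustment(level_index, adjustment):
    """Move the power level index by one step, clamped to the valid range."""
    new_index = level_index + adjustment
    return max(0, min(len(POWER_LEVELS_DBM) - 1, new_index))


def action_space_sizes(num_next_hops, num_channels):
    """Compare joint action counts for absolute vs. relative power selection."""
    absolute = num_next_hops * num_channels * len(POWER_LEVELS_DBM)
    relative = num_next_hops * num_channels * len(ADJUSTMENTS)
    return absolute, relative
```

With 4 candidate next hops and 5 channels, `action_space_sizes(4, 5)` gives 140 joint actions for absolute power selection against 60 for relative adjustment. The gap widens as the number of power levels grows, because the relative action set stays at three moves.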
Transmission latency minimization and energy efficiency improvement are two main challenges in multi-hop Cognitive Radio Networks (CRN), where knowledge of the topology and spectrum statistics is hard to obtain. For this reason, a cross-layer routing protocol based on quasi-cooperative multi-agent learning is proposed in this study. Firstly, to jointly consider end-to-end delay and power efficiency, a comprehensive utility function is designed to form a reasonable tradeoff between the two measures. Then the joint design problem is modeled as a Stochastic Game (SG), and a quasi-cooperative multi-agent learning scheme is presented to solve the SG, which only needs information exchange with previous nodes. To further enhance performance, experience replay is applied to the update of the conjecture belief to break correlations and reduce the variance of updates. Simulation results demonstrate that the proposed scheme outperforms traditional algorithms, achieving a shorter delay, lower packet loss ratio and higher energy efficiency, with performance close to that of an optimal scheme.
Cognitive Radio (CR) is a promising technology for overcoming spectrum scarcity, but it still faces many unsolved problems. One of the critical challenges in setting up such systems is how to coordinate multiple protocol layers, such as routing and spectrum access, in a partially observable environment. In this paper, a deep reinforcement learning approach is adopted to solve this problem. Firstly, to compress the huge action space in the cross-layer design problem, a novel concept named responsibility rating is introduced to help decide the transmission power of every Secondary User (SU). To deal with the curse of dimensionality while reducing replay memory, the Prioritized Memories Deep Q-Network (PM-DQN) is proposed. Furthermore, PM-DQN is applied to solve the joint routing and resource allocation problem in cognitive radio ad hoc networks to minimize transmission delay and power consumption. Simulation results illustrate that the proposed algorithm can reduce the end-to-end delay, packet loss ratio and estimation error while achieving higher energy efficiency compared with traditional algorithms.
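The abstract above does not spell out how PM-DQN prioritizes its memories. As a point of reference only, the following sketches the standard proportional prioritized experience replay rule, in which a transition is sampled with probability proportional to its absolute temporal-difference error raised to a power `alpha`. This is a swapped-in generic technique, not the paper's PM-DQN; the function names and the defaults `alpha=0.6` and `eps=1e-3` are assumptions.

```python
import random


def priority_probabilities(td_errors, alpha=0.6, eps=1e-3):
    """Turn TD errors into sampling probabilities (proportional prioritization).

    eps keeps zero-error transitions from never being sampled;
    alpha=0 reduces to uniform sampling.
    """
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(td_errors, batch_size, rng=None, alpha=0.6):
    """Draw replay-buffer indices (with replacement) according to priority."""
    rng = rng or random.Random()
    probs = priority_probabilities(td_errors, alpha=alpha)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)
```

Transitions with large errors are replayed more often, which is the usual motivation for prioritization. A full implementation would typically also apply importance-sampling weights to correct the bias this introduces, which is omitted here for brevity.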
In this paper, the problem of mission planning and spectrum resource allocation for cooperative reconnaissance of ground targets with multiple unmanned aerial vehicles (UAVs) is studied. A joint mission planning and spectrum resource optimization algorithm for multiple UAVs is proposed to improve the information transmission rate by reusing the spectrum of existing users. The joint optimization problem is formulated as a mixed-integer nonlinear programming problem. The block coordinate descent (BCD) method is then applied to obtain the optimal strategies for mission planning, channel allocation, and power control. Specifically, an improved genetic algorithm (GA) combined with successive convex approximation (SCA) is used to solve the mission planning sub-problem. For the channel allocation sub-problem, an iterative convergence channel allocation algorithm is proposed. Numerical results show that the proposed algorithm achieves a higher UAV transmission rate and better robustness than existing algorithms.
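Block coordinate descent, the outer loop named in the abstract above, optimizes one block of variables at a time while holding the others fixed. The following is a minimal sketch on a toy two-variable convex quadratic, chosen purely for illustration. It is not the paper's UAV formulation, whose blocks (mission planning, channel allocation, power control) are solved with GA, SCA and a dedicated channel allocation algorithm. The objective and all names are assumptions.

```python
def objective(x, y):
    """Toy convex objective, assumed for illustration only."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 + x * y


def block_coordinate_descent(x=0.0, y=0.0, tol=1e-10, max_iters=100):
    """Alternate exact minimization over x and y until the objective stalls."""
    prev = objective(x, y)
    for _ in range(max_iters):
        # Block 1: minimize over x with y fixed (set d/dx to zero: 2(x-1) + y = 0).
        x = 1.0 - y / 2.0
        # Block 2: minimize over y with x fixed (set d/dy to zero: 2(y+2) + x = 0).
        y = -2.0 - x / 2.0
        cur = objective(x, y)
        if prev - cur < tol:
            break
        prev = cur
    return x, y
```

For this toy problem the iterates approach the unique minimizer x = 8/3, y = -10/3, because the objective is strictly convex. For non-convex problems such as mixed-integer nonlinear programs, BCD in general guarantees only a non-increasing objective, not a global optimum.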