This article investigates cache-enabling unmanned aerial vehicle (UAV) cellular networks with massive access capability supported by non-orthogonal multiple access (NOMA). The delivery of a large volume of multimedia content to ground users is assisted by a mobile UAV base station, which caches popular content to offload wireless backhaul link traffic. In cache-enabling UAV NOMA networks, the caching placement in the content caching phase and the radio resource allocation in the content delivery phase are crucial for network performance. To cope with the dynamic UAV locations and content requests in practical scenarios, we formulate the long-term caching placement and resource allocation optimization problem for content delivery delay minimization as a Markov decision process (MDP). The UAV acts as an agent that takes actions for caching placement and resource allocation, which include the user scheduling of content requests and the power allocation of NOMA users. To tackle the MDP, we propose a Q-learning based caching placement and resource allocation algorithm, in which the UAV learns and selects actions with a soft ε-greedy strategy to search for the optimal match between actions and states. Since the action-state table size of Q-learning grows with the number of states in dynamic networks, we further propose a function approximation based algorithm that combines stochastic gradient descent and deep neural networks, making it suitable for large-scale networks. Finally, numerical results show that the proposed algorithms provide considerable performance gains over benchmark algorithms and achieve a tradeoff between network performance and computational complexity.

Index Terms—Dynamic resource allocation, non-orthogonal multiple access, reinforcement learning, unmanned aerial vehicle
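The tabular Q-learning loop with an ε-greedy policy described in the abstract can be sketched as follows. The state and action spaces, the mock delay model, and all hyperparameters below are illustrative stand-ins for the paper's actual UAV caching/scheduling environment, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hypothetical toy setting: states index (UAV position, request pattern) pairs,
# actions index (caching placement, user schedule, power level) tuples.
# All sizes, rewards, and the transition model are illustrative only.
N_STATES, N_ACTIONS = 16, 8
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)  # Q[(state, action)] -> value estimate

def select_action(state):
    """Epsilon-greedy: explore with probability EPSILON, else act greedily."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[(state, a)])

def step(state, action):
    """Stand-in environment: reward is the negative of a mock delivery delay."""
    delay = 1.0 + ((state * 7 + action * 3) % 5)   # deterministic mock delay
    next_state = (state + action + 1) % N_STATES   # mock dynamics
    return next_state, -delay

def train(episodes=200, horizon=50):
    for _ in range(episodes):
        s = random.randrange(N_STATES)
        for _ in range(horizon):
            a = select_action(s)
            s_next, r = step(s, a)
            # Standard Q-learning temporal-difference update
            best_next = max(Q[(s_next, a2)] for a2 in range(N_ACTIONS))
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s_next

train()
# Greedy policy read off the learned table: one action per state
policy = {s: max(range(N_ACTIONS), key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
```

The function-approximation variant mentioned in the abstract would replace the `Q` table with a neural network trained by stochastic gradient descent on the same temporal-difference target, which avoids the table growing with the number of states.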
I. INTRODUCTION

With the explosion of massive multimedia applications and the continuous growth of mobile data traffic, wireless communication faces the problem of limited resources. To effectively meet the increasing user demand for high data rates and low access delay, many works [1-5] have paid attention to wireless connectivity from the sky with unmanned aerial vehicles (UAVs). UAVs, also known as remotely piloted aircraft systems (RPAS) or drones, are small pilotless aircraft
Cache-enabling unmanned aerial vehicle (UAV) non-orthogonal multiple access (NOMA) networks for a mixture of augmented reality (AR) and normal multimedia applications are investigated, assisted by UAV base stations. The user association, NOMA power allocation, UAV deployment, and UAV caching placement are jointly optimized to minimize the content delivery delay. A branch and bound (BaB) based algorithm is proposed to solve the per-slot optimization. To cope with the dynamic content requests and user mobility of practical scenarios, the original optimization problem is transformed into a Stackelberg game. Specifically, the game is decomposed into a leader-level user association sub-problem and a number of follower-level power allocation, UAV deployment, and caching placement sub-problems. The long-term minimization is further solved by a deep reinforcement learning (DRL) based algorithm. Simulation results show that the content delivery delay of the proposed BaB based algorithm is much lower than that of benchmark algorithms, as the optimal solution in each time slot is achieved. Meanwhile, the proposed DRL based algorithm achieves a relatively low long-term content delivery delay in the dynamic environment with lower computational complexity than the BaB based algorithm.
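As an illustration of the branch-and-bound idea applied to a per-slot caching decision, the toy sketch below solves a binary caching placement under a cache capacity constraint, pruning with an optimistic fractional (LP-relaxation) bound. The instance data, the bound, and the delay model are hypothetical and far simpler than the joint user association / power allocation / deployment problem in the paper.

```python
# Hypothetical instance: caching content f saves delay_saving[f] of delivery
# delay but occupies size[f] units of the UAV cache. Numbers are illustrative.
delay_saving = [4.0, 3.0, 5.0, 2.0]
size = [2, 1, 3, 1]
CAPACITY = 4
BASE_DELAY = 20.0  # mock total delay with an empty cache

def bound(i, used, saved):
    """Optimistic bound: fractionally cache remaining contents by ratio."""
    cap = CAPACITY - used
    items = sorted(range(i, len(size)),
                   key=lambda f: delay_saving[f] / size[f], reverse=True)
    b = saved
    for f in items:
        take = min(size[f], cap)
        b += delay_saving[f] * take / size[f]
        cap -= take
        if cap == 0:
            break
    return b

best = {"saved": 0.0, "placement": []}

def branch(i, used, saved, placement):
    if saved > best["saved"]:  # new incumbent
        best["saved"], best["placement"] = saved, placement[:]
    if i == len(size) or bound(i, used, saved) <= best["saved"]:
        return  # leaf reached, or bound cannot beat the incumbent: prune
    if used + size[i] <= CAPACITY:          # branch 1: cache content i
        branch(i + 1, used + size[i], saved + delay_saving[i], placement + [i])
    branch(i + 1, used, saved, placement)   # branch 0: skip content i

branch(0, 0, 0.0, [])
min_delay = BASE_DELAY - best["saved"]
```

For this instance the optimum caches contents 0, 1, and 3 (total size 4, delay saving 9), giving a mock delay of 11; the bound lets the search discard the remaining subtrees without enumerating all placements.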
Edge caching has become an effective solution to cope with the challenges brought by massive content delivery in cellular networks. In device-to-device (D2D) enabled caching cellular networks with time-varying content popularity distribution and user terminal (UT) locations, we model these dynamic networks as a stochastic game to design a cooperative cache placement policy. We consider the long-term cache placement reward of all UTs in this stochastic game, where each UT becomes an agent and the cache placement policy corresponds to the actions taken by the UTs. Each UT receives the same immediate network reward from content caching and sharing. To solve the stochastic game, we propose a multi-agent cooperative alternating Q-learning (CAQL) based cache placement algorithm. In CAQL, each UT alternately updates its own cache placement policy according to the stable policies of the other UTs during the learning process, until the stable cache placement policy of all the UTs in the cell is obtained. We discuss the convergence and complexity of CAQL, which obtains the stable cache placement policy with low space complexity. Simulation results show that the proposed algorithm can effectively reduce the backhaul load and the average content access delay in dynamic networks.
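The alternating structure of CAQL, where each UT updates its own policy while the others' policies are held fixed, can be illustrated with a heavily simplified two-UT sketch. The popularity values, the shared reward, and the stateless (bandit-style) Q-tables below are assumptions made for illustration; they are not the paper's model.

```python
import random

random.seed(7)  # fixed seed for a reproducible run

# Hypothetical two-UT setting: each UT caches one of N_CONTENTS contents.
# The shared reward favors caching distinct contents (cooperative sharing).
N_CONTENTS = 3
ALPHA, EPSILON = 0.2, 0.1
popularity = [0.5, 0.3, 0.2]  # mock content popularity

def reward(a0, a1):
    """Shared immediate reward: popularity mass covered by the two caches."""
    return sum(popularity[c] for c in set((a0, a1)))

# One stateless Q-table per UT (a bandit-style simplification of the game)
Q = [[0.0] * N_CONTENTS for _ in range(2)]

def greedy(u):
    return max(range(N_CONTENTS), key=lambda a: Q[u][a])

def update_agent(u, other_action, steps=2000):
    """Train UT u while the other UT's cache choice is held fixed."""
    for _ in range(steps):
        a = random.randrange(N_CONTENTS) if random.random() < EPSILON else greedy(u)
        r = reward(a, other_action) if u == 0 else reward(other_action, a)
        Q[u][a] += ALPHA * (r - Q[u][a])

def alternate(rounds=10):
    """Alternate single-agent updates until both greedy policies are stable."""
    policy = [greedy(0), greedy(1)]
    for _ in range(rounds):
        update_agent(0, policy[1])
        update_agent(1, greedy(0))
        new_policy = [greedy(0), greedy(1)]
        if new_policy == policy:
            break
        policy = new_policy
    return policy

stable = alternate()
```

In this toy instance the cooperative reward drives the two UTs toward caching distinct contents, mirroring how CAQL's alternating updates converge to a stable joint cache placement.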