Abstract: Frequent handovers and handover failures significantly degrade the QoS of mobile users in the terrestrial segment (e.g., cellular networks) of satellite–terrestrial integrated networks (STINs). Moreover, traditional handover decision methods rely on historical data and incur a training cost. To address these problems, deep reinforcement learning- (DRL-) based handover decision methods are used in handover management. In the existing DQN-based handover decision method, the overestimates of DQ…
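The overestimation issue mentioned in this abstract is conventionally mitigated by double Q-learning: the online network selects the next action while the target network evaluates it. A minimal sketch with toy Q-functions over three hypothetical candidate cells (all values are illustrative, not from the paper):

```python
import numpy as np

def dqn_target(q_target, reward, next_state, gamma=0.99):
    # Standard DQN: the target network both selects and evaluates the
    # next action, which biases the bootstrap target upward.
    return reward + gamma * np.max(q_target(next_state))

def ddqn_target(q_online, q_target, reward, next_state, gamma=0.99):
    # Double DQN: the online network selects the action, the target
    # network evaluates it, reducing the overestimation bias.
    a = int(np.argmax(q_online(next_state)))
    return reward + gamma * q_target(next_state)[a]

# Toy Q-functions over 3 candidate target cells (values illustrative):
q_online = lambda s: np.array([1.0, 2.0, 0.5])
q_target = lambda s: np.array([1.5, 1.0, 3.0])

s_next = None  # the toy Q-functions ignore the state argument
print(dqn_target(q_target, 1.0, s_next))             # 1 + 0.99 * 3.0
print(ddqn_target(q_online, q_target, 1.0, s_next))  # 1 + 0.99 * 1.0
```

The decoupling only changes whose `argmax` is used; when the two networks disagree about the best action, the double-DQN target is typically smaller, as in this toy example.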
“…The Q-learning method is not suitable for decision problems with a high-dimensional state space. The DQN method replaces the Q-table with a DNN to approximate the action-value function, which makes it applicable to decision problems with high-dimensional state spaces [32].…”
Section: Related Work
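For intuition on why the Q-table is replaced: a tabular Q-function needs one entry per (state, action) pair, which is infeasible for continuous state vectors. A minimal sketch of a DNN action-value function, with assumed sizes (the 3-feature state and 4 candidate handover actions are illustrative choices, not from the cited work):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 3-feature state (e.g. SINR, load, dwell time)
# and 4 candidate handover actions.
STATE_DIM, HIDDEN, N_ACTIONS = 3, 16, 4
W1 = rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, N_ACTIONS))
b2 = np.zeros(N_ACTIONS)

def q_values(state):
    # One forward pass maps any continuous state vector to one
    # Q-value per action -- no per-state table entry is needed.
    h = np.maximum(state @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2

state = np.array([0.7, 0.2, 0.9])       # normalized state vector
print(q_values(state).shape)            # one Q-value per action
print(int(np.argmax(q_values(state))))  # greedy action index
```

In a full DQN, these weights would be trained by regressing the network's output toward bootstrap targets; the sketch only shows the function-approximation role of the DNN.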
“…The global information is used in the training process of the decentralized policy used by the UE. In [32], Wu et al. proposed a load balancing-based double deep Q-network (LB-DDQN) method for handover decision. In the proposed load balancing strategy, a load coefficient is defined to express the load conditions of each base station.…”
The traditional handover decision methods depend on handover thresholds and measurement reports, which cannot efficiently resolve the frequent handover issue and the ping-pong effect in 5G (fifth-generation) ultradense networks. To reduce unnecessary handovers and improve the QoS (quality of service), combined with an analysis of dwell time, we propose a state aware-based prioritized experience replay (SA-PER) handover decision method. First, the cell dwell time is computed by geometric analysis of the real-time locations of mobile users in cellular networks. The constructed state-aware sequence, including SINR, load coefficient, and dwell time, is normalized by the max-min normalization method. Then, the handover decision problem in 5G ultradense networks is formalized as a discrete Markov decision process (MDP). Random sampling and small-batch sampling affect the performance of deep reinforcement learning methods, so we adopt the prioritized experience replay (PER) method to resolve these learning-efficiency problems. The state space, action space, and reward functions are designed. The normalized state-aware decision matrix is input to the DDQN (double deep Q-network) method. The competitive and collaborative relationships between vertical handover and horizontal handover in 5G ultradense networks are discussed in detail. The high average network throughput and long average cell dwell time ensure communication quality for mobile users.
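Two of the steps above, max-min normalization of the state-aware matrix and proportional prioritized sampling, can be sketched as follows; the feature columns and numeric values are illustrative assumptions, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(1)

def max_min_normalize(x):
    # Scale each feature column of the state-aware matrix into [0, 1].
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / np.where(hi > lo, hi - lo, 1.0)

def per_sample(td_errors, batch_size, alpha=0.6):
    # Proportional prioritized experience replay: transitions with a
    # larger TD error are sampled more often than under uniform replay.
    p = (np.abs(td_errors) + 1e-6) ** alpha
    p /= p.sum()
    return rng.choice(len(td_errors), size=batch_size, p=p)

# Rows: candidate cells; columns: SINR (dB), load coefficient, dwell time (s).
states = np.array([[12.0, 0.8, 30.0],
                   [20.0, 0.3, 55.0],
                   [ 5.0, 0.6, 10.0]])
print(max_min_normalize(states))

idx = per_sample(np.array([0.1, 2.0, 0.05, 1.5]), batch_size=2)
print(idx)  # indices biased toward the high-TD-error transitions
```

A production PER implementation would also apply importance-sampling weights to correct the bias this non-uniform sampling introduces; the sketch shows only the sampling side.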
“…The authors of [18] analyzed the transmission characteristics of terrestrial and backhaul links to propose a greedy-based user association algorithm and a matching algorithm with user grouping that balances the load by performing multiple iterations between users and cells. In [19], the authors noted that current methods adopt a greedy strategy, which leads to a load imbalance problem in cells. Thus, they defined a load coefficient and added it to the reward function to make handover decisions while balancing loads.…”
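A minimal sketch of how a load coefficient can be folded into the reward so that a pure max-SINR (greedy) choice no longer dominates; the specific reward shape and weights are hypothetical, not taken from [19]:

```python
def load_coefficient(n_users, capacity):
    # Assumed simple definition: fraction of a cell's capacity in use.
    return n_users / capacity

def reward(sinr_db, rho, w_load=0.5):
    # Hypothetical shaping: the base term rewards signal quality, while
    # the load penalty discourages attaching to a nearly full cell.
    return sinr_db * (1.0 - w_load * rho)

# Greedy max-SINR would pick cell 0; the load-aware reward prefers cell 1.
cells = [(22.0, 0.95), (18.0, 0.20)]   # (SINR in dB, load coefficient)
scores = [reward(s, r) for s, r in cells]
print(scores)
```

The weight `w_load` trades off signal quality against balance: at `w_load = 0` the agent degenerates to greedy max-SINR association.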
With the development of communication systems, users are becoming more widely distributed and require higher-speed networks. A satellite–terrestrial integrated network can provide seamless coverage for these users. In previous studies of load balancing, initial access and load balancing are decided based on signal reception and are performed reactively after overloading occurs, which may not work well in satellite–terrestrial integrated networks. Therefore, this paper proposes a fuzzy-logic-based load balancing scheme. In this scheme, a fuzzy evaluation metric to pre-evaluate a user's impact on overload is presented. The fuzzy logic system is constructed based on an adaptive neuro-fuzzy system, which takes the user's signal reception, speed, and data requirement as inputs. Then, fuzzy-logic- and reinforcement-learning-based access is proposed to give an access decision for all users in the network to prevent overloading. Owing to the large dimensionality of the action space, the reinforcement learning model is trained by the proposed fuzzy deep deterministic policy gradient. Next, a fuzzy-logic-based offloading algorithm is proposed to balance the load after overloading. A simulation platform is established to evaluate performance. Simulation results indicate that the proposed scheme can ensure load balance for a longer time than baseline schemes while ensuring the data rates of users.
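One way such a fuzzy pre-evaluation can be sketched: triangular membership functions over the three stated inputs, combined with a min t-norm. The rule, membership breakpoints, and function names below are illustrative assumptions, not the paper's actual system:

```python
def tri(x, a, b, c):
    # Triangular membership function rising on [a, b], falling on [b, c].
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def overload_risk(rsrp_norm, speed_norm, demand_norm):
    # Hypothetical rule: a slow user with high data demand and strong
    # signal toward this cell is likely to add lasting load.
    slow = tri(speed_norm, -1.0, 0.0, 0.5)
    hungry = tri(demand_norm, 0.4, 1.0, 2.0)
    strong = tri(rsrp_norm, 0.3, 1.0, 2.0)
    return min(slow, hungry, strong)  # AND of antecedents (min t-norm)

# A slow (0.1), fairly demanding (0.8), well-covered (0.9) user:
print(overload_risk(0.9, 0.1, 0.8))
```

An adaptive neuro-fuzzy system, as the abstract describes, would learn these membership parameters from data rather than fixing them by hand as done here.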