2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall)
DOI: 10.1109/vtc2020-fall49728.2020.9348456

Off-policy Learning for Remote Electrical Tilt Optimization

Cited by 12 publications (6 citation statements)
References 13 publications
“…There has been a considerable amount of work in the area of antenna tilt optimization. Recent methods are mainly based on Reinforcement Learning (RL) [1,3,8,14,26], Contextual Bandits (CBs) [25], or Multi-Armed Bandits (MABs) [5,7,13,23]. Remarkably, RL methods have actually been implemented in real networks, and performance gains have been observed [1].…”
Section: B. Antenna Tilt Optimization
confidence: 99%
“…However, the aforementioned papers focus on regret minimization or on identifying an efficient tilt update policy, without any consideration for the number of samples used to do so. We should also mention that most existing studies (see, e.g., [25]) investigate off-policy learning problems, which correspond to our passive learning setting. Since the methods proposed there do not include any stopping rule, the algorithms may stop well before they have collected enough data to learn an optimal policy with reasonable confidence.…”
Section: B. Antenna Tilt Optimization
confidence: 99%
“…The authors of [12] address the RET optimization problem using an off-policy Contextual Multi-Armed Bandit (CMAB) formulation. The goal is to learn a RET policy completely offline from real-world network data.…”
Section: Related Work
confidence: 99%
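To make the off-policy CMAB setting concrete, the sketch below estimates the value of a candidate tilt policy purely from logged data via inverse propensity scoring (IPS), a standard off-policy estimator. The synthetic data, the toy target policy, and all names are illustrative assumptions, not the actual method or data of [12].

import numpy as np

# Hypothetical logged interactions from the network's behavior policy:
# (cell context, tilt action taken, probability of that action under
# the logging policy, observed reward such as a coverage/capacity KPI).
rng = np.random.default_rng(0)
n, n_actions = 1000, 3                       # tilt actions: down / keep / up
contexts = rng.normal(size=(n, 4))
actions = rng.integers(0, n_actions, size=n)
propensities = np.full(n, 1.0 / n_actions)   # uniform logging policy (assumed)
rewards = rng.uniform(size=n)

W = rng.normal(size=(n_actions, 4))          # toy linear scorer for the target policy

def target_action(context):
    # Deterministic target policy: pick the highest-scoring tilt action.
    return int(np.argmax(W @ context))

def ips_value(contexts, actions, propensities, rewards):
    # IPS: reweight logged rewards by whether the target policy would have
    # chosen the logged action, divided by the logging probability.
    match = np.array([target_action(x) == a for x, a in zip(contexts, actions)])
    return float(np.mean((match / propensities) * rewards))

print("estimated off-policy value:", ips_value(contexts, actions, propensities, rewards))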
“…, and θ_{t,c}, represented in Fig. 1, denotes the downtilt of the antenna at time t for cell c. The RET problem formulation used in this paper follows previous works in the literature based on Coverage-Capacity Optimization (CCO) [7], [12]. The goal in CCO is to maximize both network coverage and capacity while minimizing inter-cell interference.…”
Section: A. Network Environment Model
confidence: 99%
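As a rough illustration of the CCO trade-off described above, one generic way to write such an objective over the tilts θ_{t,c} is the weighted sum below; the weight α and the per-cell KPI functions are illustrative assumptions, not the exact objective of [7] or [12].

% Illustrative CCO-style objective (symbols assumed, not from the cited papers):
% choose the downtilts \theta_{t,c} to trade off a per-cell coverage KPI
% against a per-cell capacity KPI via a weight \alpha.
\[
  \max_{\{\theta_{t,c}\}_c} \;
  \sum_{c} \Big( \alpha \,\mathrm{cov}_c(\theta_{t,c})
               + (1-\alpha)\,\mathrm{cap}_c(\theta_{t,c}) \Big),
  \qquad \alpha \in [0,1].
\]

Inter-cell interference is typically penalized implicitly in such formulations, since increasing a cell's downtilt reduces its overshoot into neighboring cells.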
“…However, RL techniques tend to work better in an incremental fashion, in which the parameter is changed iteratively in small steps, limiting the negative impact of inaccurate reward estimations. A formulation that learns a policy for RET optimization completely offline from real-world network data is successfully applied in [10], although the performance of off-policy learning is highly sensitive to data quality and variability. In [11], a method based on fuzzy logic combined with a neural network is proposed that considers the impact on neighboring cells.…”
Section: Introduction
confidence: 99%
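The sensitivity of off-policy learning to data quality noted above can be illustrated with a small sketch: plain IPS weights blow up when logged propensities are small or misestimated, while a self-normalized, clipped variant trades a little bias for much lower variance. All names, the synthetic data, and the clipping level are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(1)
n = 500
match = rng.integers(0, 2, size=n).astype(float)  # does the target policy agree with the logged tilt action?
propensities = rng.uniform(0.05, 0.5, size=n)     # noisy, possibly misestimated logging probabilities
rewards = rng.uniform(size=n)

def ips(match, p, r):
    # Unbiased but high-variance: small propensities inflate the weights.
    return float(np.mean((match / p) * r))

def snips(match, p, r, clip=10.0):
    # Self-normalized IPS with weight clipping: a common variance-reduction
    # fix when logged data quality is poor (slightly biased, far more stable).
    w = np.minimum(match / p, clip)
    return float(np.sum(w * r) / np.sum(w))

print("IPS estimate:  ", ips(match, propensities, rewards))
print("SNIPS estimate:", snips(match, propensities, rewards))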