2020
DOI: 10.48550/arxiv.2005.10577
Preprint

Off-policy Learning for Remote Electrical Tilt Optimization

Abstract: We address the problem of Remote Electrical Tilt (RET) optimization using off-policy Contextual Multi-Armed-Bandit (CMAB) techniques. The goal in RET optimization is to control the orientation of the vertical tilt angle of the antenna to optimize Key Performance Indicators (KPIs) representing the Quality of Service (QoS) perceived by the users in cellular networks. Learning an improved tilt update policy is hard. On the one hand, coming up with a new policy in an online manner in a real network requires explor…
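The abstract describes learning a tilt-update policy offline, from data logged under an existing policy, rather than by risky online exploration. A standard building block for this kind of off-policy CMAB learning is the inverse propensity scoring (IPS) estimator, which reweights logged rewards by the ratio of target-policy to logging-policy action probabilities. The sketch below is illustrative only, not the authors' method: the three-action tilt space, the synthetic logged data, and the softmax target policy are all assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical action space: down-tilt, keep, up-tilt (illustrative).
    N_ACTIONS = 3

    def ips_value(contexts, actions, rewards, logging_probs, target_probs_fn):
        """IPS estimate of a target policy's expected reward from logged data.

        Each logged reward is reweighted by pi_target(a|x) / pi_logging(a|x),
        making the sample mean an unbiased estimate of the target's value.
        """
        w = np.array([target_probs_fn(x)[a] for x, a in zip(contexts, actions)])
        return float(np.mean(w / logging_probs * rewards))

    # Synthetic log from a uniform-random logging policy (assumption).
    n, d = 1000, 4
    contexts = rng.normal(size=(n, d))            # per-cell KPI features (made up)
    actions = rng.integers(0, N_ACTIONS, size=n)  # tilt updates actually taken
    logging_probs = np.full(n, 1.0 / N_ACTIONS)   # propensity of each logged action
    rewards = rng.normal(size=n) + (actions == 1) # toy KPI gain favoring "keep"

    def target_probs(x):
        """A hypothetical softmax target policy over the three tilt updates."""
        logits = np.array([x[0], x[1], -abs(x[2])])
        e = np.exp(logits - logits.max())
        return e / e.sum()

    print("IPS value estimate:",
          ips_value(contexts, actions, rewards, logging_probs, target_probs))

A new tilt policy would then be selected to maximize such an estimate (in practice with a variance-reduced variant, e.g. self-normalized IPS or doubly robust estimation), which allows candidate policies to be screened offline before being deployed on a live network.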

Cited by 2 publications (4 citation statements, all published in 2021). References 14 publications (19 reference statements).
“…When using reinforcement learning, the problem of coordination still arises. Treating each antenna as an independent learning agent has been used in the past to address the problem of optimizing mobile networks [6], [8], [9], [11], [12], but this fails to capture phenomena like interference. Learning algorithms leveraging coordination can use a centralized controller [7], [13], which does not scale to a large number of agents.…”
Section: Related Work (mentioning, confidence: 99%)
“…Existing approaches for network optimization rely on hand-engineered strategies that are suboptimal and hard to scale [2]–[4]. Methods relying on mathematical models [5] or reinforcement learning (RL) are also used for network optimization [6]–[11]; they are more robust and principled.…”
Section: Introduction (mentioning, confidence: 99%)
“…However, it is known that the large-scale exploration performed by RL algorithms can sometimes take the system to unsafe states [7]. In the problem of RET optimization, RL has proven to be an effective framework for KPI optimization due to its self-learning capabilities and adaptivity to potential environment changes [16]. To address the safety problem (i.e., to guarantee that the desired KPIs remain within specified bounds), the authors of [16] proposed a statistical approach to empirically evaluate RET optimization under different baseline policies and in different worst-case scenarios.…”
Section: Introduction (mentioning, confidence: 99%)
“…However, they assume that the abstraction of the system dynamics into an MDP is given, which is challenging in the network applications that this demonstration refers to. As mentioned previously, the authors of [16] address the safe RET optimization problem, but their approach relies on statistical guarantees and cannot handle the general LTL specifications that we treat in this manuscript.…”
Section: Introduction (mentioning, confidence: 99%)