2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall)
DOI: 10.1109/vtc2020-fall49728.2020.9348456

Off-policy Learning for Remote Electrical Tilt Optimization

Cited by 12 publications (6 citation statements)
References 13 publications
“…There has been a considerable amount of work in the area of antenna tilt optimization. Recent methods are mainly based on Reinforcement Learning (RL) [1,3,8,14,26], Contextual Bandits (CBs) [25], or Multi-Armed Bandits (MABs) [5,7,13,23]. Remarkably, RL methods have actually been implemented in real networks, and performance gains have been observed [1].…”
Section: B. Antenna Tilt Optimization
confidence: 99%
“…However, the aforementioned papers focus on regret minimization or on identifying an efficient tilt update policy, without any consideration for the number of samples used to do so. We should also mention that most existing studies (see, e.g., [25]) investigate off-policy learning problems, which correspond to our passive learning setting. Since the methods proposed there do not include any stopping rule, the algorithms may stop well before they have collected enough data to learn an optimal policy with reasonable confidence.…”
Section: B. Antenna Tilt Optimization
confidence: 99%
“…The authors of [12] address the RET optimization problem using an off-policy Contextual Multi-Armed Bandit (CMAB) formulation. The goal is to learn a RET policy completely offline from real-world network data.…”
Section: Related Work
confidence: 99%
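To make the off-policy CMAB setting concrete, the sketch below estimates the value of a candidate tilt policy purely from logged data via inverse propensity scoring (IPS), a standard off-policy estimator. The synthetic data, the toy target policy, and all names are illustrative assumptions, not the actual method or data of [12].

import numpy as np

# Hypothetical logged interactions from the network's behavior policy:
# (cell context, tilt action taken, probability of that action under
# the logging policy, observed reward such as a coverage/capacity KPI).
rng = np.random.default_rng(0)
n, n_actions = 1000, 3                       # tilt actions: down / keep / up
contexts = rng.normal(size=(n, 4))
actions = rng.integers(0, n_actions, size=n)
propensities = np.full(n, 1.0 / n_actions)   # uniform logging policy (assumed)
rewards = rng.uniform(size=n)

W = rng.normal(size=(n_actions, 4))          # toy linear scorer for the target policy

def target_action(context):
    # Deterministic target policy: pick the highest-scoring tilt action.
    return int(np.argmax(W @ context))

def ips_value(contexts, actions, propensities, rewards):
    # IPS: reweight logged rewards by whether the target policy would have
    # chosen the logged action, divided by the logging probability.
    match = np.array([target_action(x) == a for x, a in zip(contexts, actions)])
    return float(np.mean((match / propensities) * rewards))

print("estimated off-policy value:", ips_value(contexts, actions, propensities, rewards))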
“…, and θ_{t,c}, represented in Fig. 1, denotes the downtilt of the antenna at time t for cell c. The RET problem formulation used in this paper follows previous works in the literature based on Coverage-Capacity Optimization (CCO) [7], [12]. The goal in CCO is to maximize both network coverage and capacity while minimizing inter-cell interference.…”
Section: A. Network Environment Model
confidence: 99%
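As a rough illustration of the CCO trade-off described above, one generic way to write such an objective over the tilts θ_{t,c} is the weighted sum below; the weight α and the per-cell KPI functions are illustrative assumptions, not the exact objective of [7] or [12].

% Illustrative CCO-style objective (symbols assumed, not from the cited papers):
% choose the downtilts \theta_{t,c} to trade off a per-cell coverage KPI
% against a per-cell capacity KPI via a weight \alpha.
\[
  \max_{\{\theta_{t,c}\}_c} \;
  \sum_{c} \Big( \alpha \,\mathrm{cov}_c(\theta_{t,c})
               + (1-\alpha)\,\mathrm{cap}_c(\theta_{t,c}) \Big),
  \qquad \alpha \in [0,1].
\]

Inter-cell interference is typically penalized implicitly in such formulations, since increasing a cell's downtilt reduces its overshoot into neighboring cells.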
“…However, RL techniques tend to work better in an incremental fashion, in which the parameter is changed iteratively in small steps, limiting the negative impact of inaccurate reward estimations. A formulation that learns a policy for RET optimization completely offline from real-world network data is successfully applied in [10], although the performance of off-policy learning is highly sensitive to data quality and variability. In [11], a method based on fuzzy logic combined with a neural network is proposed that considers the impact on neighboring cells.…”
Section: Introduction
confidence: 99%
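The sensitivity of off-policy learning to data quality noted above can be illustrated with a small sketch: plain IPS weights blow up when logged propensities are small or misestimated, while a self-normalized, clipped variant trades a little bias for much lower variance. All names, the synthetic data, and the clipping level are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(1)
n = 500
match = rng.integers(0, 2, size=n).astype(float)  # does the target policy agree with the logged tilt action?
propensities = rng.uniform(0.05, 0.5, size=n)     # noisy, possibly misestimated logging probabilities
rewards = rng.uniform(size=n)

def ips(match, p, r):
    # Unbiased but high-variance: small propensities inflate the weights.
    return float(np.mean((match / p) * r))

def snips(match, p, r, clip=10.0):
    # Self-normalized IPS with weight clipping: a common variance-reduction
    # fix when logged data quality is poor (slightly biased, far more stable).
    w = np.minimum(match / p, clip)
    return float(np.sum(w * r) / np.sum(w))

print("IPS estimate:  ", ips(match, propensities, rewards))
print("SNIPS estimate:", snips(match, propensities, rewards))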