2018
DOI: 10.3390/a11020013

muMAB: A Multi-Armed Bandit Model for Wireless Network Selection

Abstract: Multi-armed bandit (MAB) models are a viable approach to describe the problem of best wireless network selection by a multi-Radio Access Technology (multi-RAT) device, with the goal of maximizing the quality perceived by the final user. The classical MAB model, however, does not properly describe the problem of wireless network selection by a multi-RAT device, in which a device typically performs a set of measurements to collect information on the available networks before a selection takes place…
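The abstract distinguishes two action types: measuring a network to gather information versus using it to accrue reward. Below is a minimal Python sketch of that idea, based only on the abstract's description; the Network class, reward model, probe counts, and durations are illustrative assumptions, not the paper's exact formulation.

```python
import random

# Illustrative sketch of the muMAB idea from the abstract: the device can
# either *measure* a network (a short probe that only refines the quality
# estimate) or *use* it (commit for a while and accrue the actual reward).
# Reward distributions and durations are assumptions made for this demo.

class Network:
    def __init__(self, name, mean_quality):
        self.name = name
        self.mean_quality = mean_quality  # unknown to the device

    def sample_quality(self):
        # Noisy observation of perceived quality (hypothetical model).
        return max(0.0, random.gauss(self.mean_quality, 0.1))

class MuMABDevice:
    def __init__(self, networks):
        self.networks = networks
        self.estimates = {n.name: 0.0 for n in networks}
        self.counts = {n.name: 0 for n in networks}

    def measure(self, network):
        # "measure" action: gathers information only, earns no reward.
        q = network.sample_quality()
        c = self.counts[network.name] + 1
        self.counts[network.name] = c
        self.estimates[network.name] += (q - self.estimates[network.name]) / c

    def use(self, network, duration):
        # "use" action: connect and accumulate reward over `duration` slots.
        return sum(network.sample_quality() for _ in range(duration))

networks = [Network("WiFi", 0.8), Network("LTE", 0.6)]
device = MuMABDevice(networks)
for net in networks:          # measurement phase: probe each network a few times
    for _ in range(5):
        device.measure(net)
best = max(networks, key=lambda n: device.estimates[n.name])
print(f"Selected {best.name}, reward over 100 slots: {device.use(best, 100):.1f}")
```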

Cited by 23 publications (16 citation statements) | References 23 publications
“…Measure-use MAB (muMAB) is proposed in [214] to better adapt MAB models to RAT selection. Classic MABs enable one possible action type in both exploration and exploitation phases, that is, to select an arm and collect the corresponding reward…”
Section: Stateless MAB
confidence: 99%
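For contrast with muMAB's two action types, here is a minimal sketch of the single action type this statement attributes to classic MABs: every step selects an arm and collects its reward, so exploration and exploitation share one primitive. The UCB1 index is a standard textbook choice assumed for illustration, not taken from [214].

```python
import math
import random

def ucb1(pulls, rewards, t):
    """Classic MAB: one action type only -- pick an arm, collect its reward.
    Returns the arm with the highest UCB1 score at step t."""
    scores = []
    for n, s in zip(pulls, rewards):
        if n == 0:
            return len(scores)  # pull each arm once before scoring
        scores.append(s / n + math.sqrt(2 * math.log(t) / n))
    return scores.index(max(scores))

# Hypothetical arm means standing in for per-network quality.
means = [0.8, 0.6, 0.4]
pulls = [0] * len(means)
rewards = [0.0] * len(means)
for t in range(1, 1001):
    arm = ucb1(pulls, rewards, t)
    r = random.gauss(means[arm], 0.1)  # selecting the arm yields its reward
    pulls[arm] += 1
    rewards[arm] += r
print("pull counts:", pulls)
```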
“…However, the paper does not address the well-known problems of MPTCP such as network middleboxes and TCP modifiers. Boldrini et al. [35] try to answer the following question: among the multiple available wireless networks, which one can offer the best performance in terms of the quality observed by end users? To answer this question, the paper lays the foundations in two main steps: (1) define QoS/QoE parameters; and (2) define the network selection algorithm…”
Section: Mobile Wireless Video Streaming
confidence: 99%
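The two-step structure attributed to Boldrini et al. [35] (define QoS/QoE parameters, then a selection algorithm) can be sketched as follows; the specific parameters, normalization constants, and weighted score are illustrative assumptions, not the metric of [35].

```python
# Step 1: define QoS/QoE parameters. Weights are assumptions: throughput
# helps the score, latency and loss hurt it.
qos_weights = {"throughput_mbps": 0.5, "latency_ms": -0.3, "loss_rate": -0.2}

def qoe_score(measurements):
    """Combine normalized QoS measurements into a single QoE-style score."""
    norm = {
        "throughput_mbps": measurements["throughput_mbps"] / 100.0,
        "latency_ms": measurements["latency_ms"] / 200.0,
        "loss_rate": measurements["loss_rate"],
    }
    return sum(qos_weights[k] * norm[k] for k in qos_weights)

def select_network(candidates):
    """Step 2: pick the network whose measurements score highest."""
    return max(candidates, key=lambda name: qoe_score(candidates[name]))

candidates = {  # hypothetical measurement snapshots per available network
    "WiFi": {"throughput_mbps": 60, "latency_ms": 30, "loss_rate": 0.01},
    "LTE": {"throughput_mbps": 40, "latency_ms": 50, "loss_rate": 0.005},
}
print(select_network(candidates))
```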
“…Contextual bandits are a subset of RL algorithms that are considerably simpler: only one step exists before the outcome is observed. The contextual bandit is an extension of the multi-armed bandit approach [28] wherein the context or state information is considered. Unlike in multi-armed bandits, the state affects how a reward is associated with each action, and therefore, as the states change, the model needs to learn to adapt its action choice…”
Section: A Reinforcement Learning-Based RRH Selection
confidence: 99%
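The distinction drawn here, that the state changes which action is rewarding, can be illustrated with a minimal epsilon-greedy contextual bandit that keeps a separate value estimate per (context, action) pair. The discrete contexts, actions, and reward table below are illustrative assumptions, not the RRH-selection setup of the citing paper.

```python
import random
from collections import defaultdict

# Minimal epsilon-greedy contextual bandit: unlike a plain MAB, value
# estimates are kept per (context, action), so the preferred action can
# differ across contexts. Contexts and rewards are illustrative.

ACTIONS = ["rrh_a", "rrh_b"]
CONTEXTS = ["low_load", "high_load"]
TRUE_MEANS = {  # hypothetical: the best action depends on the context
    ("low_load", "rrh_a"): 0.9, ("low_load", "rrh_b"): 0.5,
    ("high_load", "rrh_a"): 0.3, ("high_load", "rrh_b"): 0.7,
}

values = defaultdict(float)   # running mean reward per (context, action)
counts = defaultdict(int)
epsilon = 0.1

for step in range(5000):
    ctx = random.choice(CONTEXTS)                 # observe the context
    if random.random() < epsilon:
        action = random.choice(ACTIONS)           # explore
    else:                                         # exploit current estimates
        action = max(ACTIONS, key=lambda a: values[(ctx, a)])
    reward = random.gauss(TRUE_MEANS[(ctx, action)], 0.1)
    counts[(ctx, action)] += 1
    values[(ctx, action)] += (reward - values[(ctx, action)]) / counts[(ctx, action)]

for ctx in CONTEXTS:
    best = max(ACTIONS, key=lambda a: values[(ctx, a)])
    print(f"{ctx}: prefer {best}")
```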