Age of Information Aware VNF Scheduling in Industrial IoT Using Deep Reinforcement Learning

Akbari, Mohammad; Abedi, Mohammad Reza; Joda, Roghayeh; Pourghasemian, Mohsen; Mokari, Nader; Erol‐Kantarci, Melike

doi:10.1109/jsac.2021.3087264

Cited by 49 publications

(12 citation statements)

References 38 publications

“…Different from the existing DRL algorithms which consider either purely discrete [22], [30], [31] or purely continuous actions [12], [32], [33], we here study a more practical DRL setting with a hybrid discrete-continuous action for improving the training performance. Even though such a hybrid action setting has been previously mentioned in a few related works such as [34], a holistic investigation on the sampling of discrete and continuous actions has not been given. Therefore, we propose a new parameterized advantage actor critic (A2C) algorithm to optimize the system latency with actor and critic designs.…”

Section: DRL Design With Parameterized A2C for BFL (mentioning)

confidence: 99%

Latency Optimization for Blockchain-Empowered Federated Learning in Multi-Server Edge Computing

Nguyen, Hosseinalipour, Love et al. 2022

Preprint

In this paper, we study a new latency optimization problem for Blockchain-based federated learning (BFL) in multiserver edge computing. In this system model, distributed mobile devices (MDs) communicate with a set of edge servers (ESs) to handle both machine learning (ML) model training and block mining simultaneously. To assist the ML model training for resource-constrained MDs, we develop an offloading strategy that enables MDs to transmit their data to one of the associated ESs. We then propose a new decentralized ML model aggregation solution at the edge layer based on a consensus mechanism to build a global ML model via peer-to-peer (P2P)-based Blockchain communications. We then formulate latency-aware BFL as an optimization aiming to minimize the system latency via joint consideration of the data offloading decisions, MDs' transmit power, channel bandwidth allocation for MDs' data offloading, MDs' computational allocation, and hash power allocation. To address the mixed action space of discrete offloading and continuous allocation variables, we propose a novel deep reinforcement learning scheme with a holistic design of a parameterized advantage actor critic (A2C) algorithm. Additionally, we theoretically characterize the convergence properties of the proposed BFL system in terms of the aggregation delay, mini-batch size, and number of P2P communication rounds. Our subsequent numerical evaluation demonstrates the superior performance of our proposed scheme over existing approaches in terms of model training efficiency, convergence rate, and system latency.
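The mixed action space mentioned in this abstract (a discrete offloading decision paired with continuous allocation variables) can be illustrated with a minimal sketch. This is not the authors' parameterized A2C algorithm; it only shows one common way to sample a hybrid action: draw the discrete choice from a softmax over actor logits, then draw a continuous parameter from a Gaussian tied to that choice. The names `softmax`, `sample_hybrid_action`, `logits`, `means`, and `std`, and the [0, 1] normalization of the continuous variable, are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(logits, means, std, rng):
    """Sample (discrete choice, continuous parameter) for one agent.

    logits: hypothetical actor scores, one per discrete option
            (e.g. which edge server to offload to).
    means:  hypothetical actor outputs, one mean per discrete option, for a
            continuous variable normalized to [0, 1] (e.g. a power fraction).
    std:    assumed fixed exploration noise for the continuous part.
    """
    probs = softmax(logits)
    choice = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    raw = rng.gauss(means[choice], std)
    parameter = min(1.0, max(0.0, raw))  # clip to the feasible range
    return choice, parameter


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_hybrid_action([0.2, 1.5, -0.3], [0.4, 0.7, 0.1], 0.05, rng))
```

In an actor-critic setup, the logits and means would come from the actor network and the critic's advantage estimate would weight the log-probabilities of both parts; that training loop is omitted here.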

“…. , R M } by (24); problem that M agents cannot be constructed simultaneously due to the limited of CPU resources and storage space, when M is large.…”

Section: Model Training and Parameter Migration (mentioning)

confidence: 99%

“…Dalgkitsis et al [23] leveraged the deep deterministic policy gradient algorithm to implement dynamic resource-aware VNF placement. Akbari et al [24] studied a VNF placement and scheduling problem in an industrial internet-of-things network, and applied an actor-critic RL to jointly minimize the VNF cost and the age of information.…”

Section: Introduction (mentioning)

confidence: 99%

Multi-Agent Deep Reinforcement Learning for Cost- and Delay-Sensitive Virtual Network Function Placement and Routing

Wang, Yuen, Ni et al. 2022

Preprint

This paper proposes an effective and novel multi-agent deep reinforcement learning (MADRL)-based method for solving the joint virtual network function (VNF) placement and routing (P&R) problem, where multiple service requests with differentiated demands are delivered at the same time. The differentiated demands of the service requests are reflected by their delay- and cost-sensitive factors. We first construct a VNF P&R problem to jointly minimize a weighted sum of service delay and resource consumption cost, which is NP-complete. Then, the joint VNF P&R problem is decoupled into two iterative subtasks: a placement subtask and a routing subtask. Each subtask consists of multiple concurrent parallel sequential decision processes. By invoking the deep deterministic policy gradient method and multi-agent technique, an MADRL-P&R framework is designed to perform the two subtasks. A new joint reward and internal rewards mechanism is proposed to match the goals and constraints of the placement and routing subtasks. We also propose a parameter migration-based model-retraining method to deal with changing network topologies. Corroborated by experiments, the proposed MADRL-P&R framework is superior to its alternatives in terms of service cost and delay, and offers higher flexibility for personalized service demands. The parameter migration-based model-retraining method can efficiently accelerate convergence under moderate network topology changes.

“…Furthermore, dealing with age-optimal scheduling problem using reinforcement learning approaches in an unknown environment has recently drawn great attention [6], [20]- [28] and to the best of our knowledge, the first application of RL approaches to the problem with a minimum AoI criterion appeared in [6], which employed the average-cost SARSA with softmax algorithm to learn the system parameters and the transmission policy under hybrid ARQ (HARQ) protocols.…”

Section: Introduction (mentioning)

confidence: 99%

“…In [27], an underwater linear network was considered, in which the authors developed an actor-critic DRL based on a deep deterministic policy gradient (DDPG) method to minimize the normalized weighted sum AoI. In [28], the authors developed single-agent and cooperative multi-agent virtual network function (VNF) placement utilizing DRL method to minimize VNF placement cost, scheduling cost, and average AoI in industrial internet of things (IIoT). However, none of the multi-user system works consider NOMA transmission scheme when using RL to solve AoI minimization problems in the unknown environment.…”

Section: Introduction (mentioning)

confidence: 99%

How to Minimize the Weighted Sum AoI in Multi-Source Status Update Systems: OMA or NOMA?

Wang, Qiao 2022

Preprint

In this paper, the minimization of the weighted sum average age of information (AoI) in a multisource status update communication system is studied. Multiple independent sources send update packets to a common destination node in a time-slotted manner under the limit of maximum retransmission rounds. Different multiple access schemes, i.e., orthogonal multiple access (OMA) and non-orthogonal multiple access (NOMA), are exploited here over a block-fading multiple access channel (MAC). Constrained Markov decision process (CMDP) problems are formulated to describe the AoI minimization problems considering both transmission schemes. The Lagrangian method is utilised to convert CMDP problems to unconstrained Markov decision process (MDP) problems, and corresponding algorithms to derive the power allocation policies are obtained. On the other hand, for the case of unknown environments, two online reinforcement learning approaches considering both multiple access schemes are proposed to achieve near-optimal age performance. Numerical simulations validate the improvement of the proposed policy in terms of weighted sum AoI compared to the fixed power transmission policy, and illustrate that NOMA is more favorable in case of larger packet size.

... networks and disaster monitoring and alerting systems, strictly guaranteeing the timeliness of information updates is crucial since outdated information might become worthless. From the system's perspective, knowledge of the status of a remote sensor or system needs to be as timely as possible, so the timeliness of state updates has evolved into a new field of network research [3].

To characterize such information timeliness and freshness, the metric termed age of information (AoI), typically defined as the time elapsed since the most recent successfully received system information was generated at the source, has been proposed [4]. Most of the earlier work on AoI in various networks mainly considers simple single-source single-destination status update system models (see, e.g., [4]-[8]), while recent research related to AoI optimization has shifted to more practical multi-source and/or multi-destination systems, most of which involve the orthogonal multiple access (OMA) technique [9]-[14]. For instance, the authors in [9] considered a system model in which a central controller collects data from multiple sensors via wireless links and the AoI optimization problem is subject to both bandwidth and power consumption constraints. Besides, in [9], a truncated scheduling policy was proposed to satisfy the hard bandwidth constraint. The work in [10] presented two multi-source information update problems in a practical IoT system, called AoI-aware Multi-Source Information Updating (AoI-MSIU) and AoI-Reduction-aware Multi-Source Information Updating (AoIR-MSIU) problems, respectively. A wireless broadcast network with random arrivals was considered in [11], where two offline and two online scheduling algorithms were proposed, leveraging Markov decision process (MDP) techniques and the ...
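The AoI definition quoted above (time elapsed since the freshest successfully received update was generated) can be made concrete with a small time-slotted sketch. It is an illustration under simple assumptions, not the model of any cited paper: time is discrete, `deliveries` maps a delivery slot to the generation slot of the update received in it, and the age counts up from `initial_age` until the first delivery. All function and parameter names are hypothetical.

```python
def aoi_trajectory(deliveries, horizon, initial_age=0):
    """Return the AoI observed at the destination in slots 0..horizon-1.

    deliveries: dict mapping delivery slot -> generation slot of that update.
    """
    ages = []
    latest_generation = None  # generation slot of the freshest update so far
    for slot in range(horizon):
        if slot in deliveries:
            generation = deliveries[slot]
            # An older update arriving late must not make the destination staler.
            if latest_generation is None or generation > latest_generation:
                latest_generation = generation
        if latest_generation is None:
            ages.append(initial_age + slot)  # nothing received yet
        else:
            ages.append(slot - latest_generation)
    return ages


def average_aoi(ages):
    """Time-average AoI over the observed slots."""
    return sum(ages) / len(ages)


def weighted_sum_aoi(average_ages, weights):
    """Weighted sum of per-source average AoI values."""
    return sum(w * a for w, a in zip(weights, average_ages))


if __name__ == "__main__":
    # Updates generated in slots 1 and 3 are delivered in slots 2 and 5.
    ages = aoi_trajectory({2: 1, 5: 3}, horizon=7)
    print(ages)               # [0, 1, 1, 2, 3, 2, 3]
    print(average_aoi(ages))  # 12 / 7
```

The age grows by one each slot and drops to the delivery delay whenever a fresher update arrives, which gives the sawtooth shape usually associated with AoI.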
