2021
DOI: 10.48550/arxiv.2104.10403
Preprint

Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks

Omid Esrafilian, Harald Bayerlein, David Gesbert

Abstract: Deep Reinforcement Learning (DRL) is gaining attention as a potential approach to design trajectories for autonomous unmanned aerial vehicles (UAVs) used as flying access points in the context of cellular or Internet of Things (IoT) connectivity. DRL solutions offer the advantage of on-the-go learning, hence relying on very little prior contextual information. A corresponding drawback, however, lies in the need for many learning episodes, which severely restricts the applicability of such an approach in real-world tim…
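For intuition about the sample-efficiency argument in the abstract, the sketch below shows a generic Dyna-style loop in which a learned transition model generates synthetic updates alongside real ones, so fewer real flight episodes are needed. It illustrates the general model-aided idea only, not the authors' algorithm; the grid size, hyperparameters, and environment API are all assumptions.

```python
import numpy as np

# Hedged illustration only: a generic Dyna-Q loop, not the paper's method.
# A learned model of the environment supplies extra (simulated) transitions,
# which is where the sample-efficiency gain over plain model-free RL comes
# from. All constants below are assumptions for the sketch.

N_STATES, N_ACTIONS = 25, 4          # assumed 5x5 grid world, 4 moves
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1   # assumed learning hyperparameters
PLANNING_STEPS = 20                  # model rollouts per real step

Q = np.zeros((N_STATES, N_ACTIONS))
model = {}                           # (s, a) -> (r, s'), learned from data

def dyna_q_step(env, s, rng):
    """One real step plus PLANNING_STEPS simulated updates (assumed env API)."""
    # epsilon-greedy action in the real environment
    a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(Q[s].argmax())
    r, s_next = env.step(s, a)       # assumed: returns (reward, next_state)

    # direct RL update from the real transition
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])

    # model learning: remember the observed transition
    model[(s, a)] = (r, s_next)

    # planning: replay simulated transitions drawn from the learned model
    keys = list(model)
    for _ in range(PLANNING_STEPS):
        ms, ma = keys[rng.integers(len(keys))]
        mr, ms_next = model[(ms, ma)]
        Q[ms, ma] += ALPHA * (mr + GAMMA * Q[ms_next].max() - Q[ms, ma])

    return s_next
```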


Cited by 2 publications (2 citation statements)
References 11 publications
“…Model-aided Deep Reinforcement Learning (Q-learning) [31] and MARL with Deep Q-Learning [32] approaches; RL and deep RL have been used for UAV navigation because of their ability to learn directly through interaction with the surrounding environment [24], [39]-[43]. When the environment has a grid-world representation (e.g., indoors), Q-learning represents a simple and optimal solution because state-action pairs can be represented by a tractable Q-table that is updated at each time instant according to the received rewards [4], [14], [44].…”
Section: Applications, Optimization Objective, Techniques (mentioning)
confidence: 99%
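The tractable Q-table update the excerpt refers to is the standard tabular temporal-difference rule; a minimal sketch follows, with the grid dimensions, reward, and hyperparameters assumed purely for illustration.

```python
import numpy as np

# Minimal tabular Q-learning update for a grid world, as in the excerpt
# above. The grid size and hyperparameters are assumptions for the sketch.

n_states, n_actions = 16, 4              # assumed 4x4 grid, 4 moves
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9                  # assumed learning rate and discount

def q_update(s, a, r, s_next):
    """One temporal-difference update of the Q-table entry (s, a)."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# example: taking action 1 in state 0 gave reward -1.0 and led to state 4
q_update(0, 1, -1.0, 4)
```

Because every state-action pair is a single table cell, the update is exact and cheap, which is why the excerpt calls Q-learning "simple and optimal" for small grid-world representations.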
“…Until now, very few studies have been reported that deal exclusively with the UAV trajectory design problem under a partially observed networking environment and unexpected mobility/locations of the IoT devices. In [20], the authors introduced a model-based deep reinforcement learning (DRL) UAV path planning algorithm for data collection, with a device localization mechanism that divides the ground nodes into those with known and those with unknown locations. Nonetheless, they assumed that the UAVs are given predetermined targets and that the IoT nodes are static with complete location information.…”
Section: A Literature Review (mentioning)
confidence: 99%
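The known/unknown split of ground nodes described in the excerpt can be pictured as simple bookkeeping that promotes a device to "known" once its location estimate becomes confident enough. The sketch below is purely illustrative and is not the localization mechanism of [20]; the field names and confidence threshold are assumptions.

```python
from dataclasses import dataclass, field

# Illustrative bookkeeping for a known/unknown device split; not the
# mechanism of [20]. Names and the threshold are assumptions.

@dataclass
class DeviceMap:
    known: dict = field(default_factory=dict)    # node_id -> (x, y) estimate
    unknown: set = field(default_factory=set)    # node ids still unlocalized

    def observe(self, node_id, estimate, confidence, threshold=0.9):
        """Promote a node to 'known' once its location estimate is confident."""
        if node_id in self.unknown and confidence >= threshold:
            self.known[node_id] = estimate
            self.unknown.discard(node_id)
```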