2022
DOI: 10.1109/tvt.2022.3144277

Deep Reinforcement Learning Approach for Joint Trajectory Design in Multi-UAV IoT Networks

Cited by 29 publications
(8 citation statements)
References 9 publications
“…Their work is potentially applicable for smart farms and factories. Xu et al implemented k-means and deep reinforcement learning algorithms to optimize multi-UAV trajectory for uplink data collection in IoT networks [274]. The algorithm aimed to minimize data collection time while considering criteria such as maximum speed, maximum acceleration, collision avoidance, and UAV communication interference with promising results.…”
Section: Internet of Things (IoT)
Citation type: mentioning
confidence: 99%
“…Given the recent advances in deep reinforcement learning (DRL), the authors of [4][5][6][7][8] proposed techniques based on DRL for resolving the above challenges for AoI-optimal transmission policies. Li et al [4] considered discretized trajectories and harnessed deep Q-network (DQN) to design their control policies.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To learn actions involving both continuous and discrete variables, Hu et al [7] combines DQN and DDPG to design UAVs trajectories for AoI minimization. To tackle multi-agent trajectory planning problems, Xu et al [8] utilized an independent agent based Q-Learning method for minimizing the mission completion time of multi-UAV-aided data collection. However, the solutions in [4][5][6][7][8] learn the policy for each agent independently while treating other simultaneously-learning agents as part of the environment, which results in the non-stationary problems.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…It is a promising way to apply RL algorithms to solve the UAV maneuver decision-making problem (Hu et al, 2022). Regarding long-horizon tasks, the RL agent depends on dense rewards to train and construct long-term decision sequences (Xu et al, 2022; Nagpal et al, 2020). However, reward signals are highly sparse in the UAV maneuver decision-making problem due to the lack of expert experience (Sun et al, 2021).…”
Section: Introduction
Citation type: mentioning
confidence: 99%