2020 International Conference on Unmanned Aircraft Systems (ICUAS)
DOI: 10.1109/icuas48674.2020.9213856
UAV Target Tracking in Urban Environments Using Deep Reinforcement Learning

Cited by 40 publications (26 citation statements)
References 17 publications
“…Next, the optimal reward function that minimizes the trajectory tracking error was found, and a reinforcement learning-based controller using this reward function was proposed. In the work of [39], Target Following DQN (TF-DQN), a deep reinforcement learning technique based on DQNs, was proposed with a curriculum training framework that enables the UAV to persistently track the target in the presence of obstacles and target motion uncertainty. A piecewise reward function was proposed to assign different rewards depending on whether or not a collision occurs.…”
Section: Literature Survey
confidence: 99%
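To make the idea of a piecewise, collision-aware reward concrete, the sketch below shows one possible shape for such a function. It is not taken from [39]; the thresholds, weights, and function name are illustrative assumptions.

```python
# Minimal sketch of a piecewise, collision-aware tracking reward.
# All thresholds and weights are illustrative assumptions, not values
# from the cited TF-DQN paper.

def tracking_reward(dist_to_target: float,
                    dist_to_obstacle: float,
                    collision_radius: float = 0.5,
                    desired_range: float = 5.0) -> float:
    """Return a reward that switches branches on collision status."""
    if dist_to_obstacle <= collision_radius:
        return -100.0                      # collision branch: large penalty
    if dist_to_target <= desired_range:
        return 1.0                         # target held within desired range
    return -0.1 * (dist_to_target - desired_range)  # shaped penalty otherwise
```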
“…The framework consists of a global planner based on a modern online Partially Observable Markov Decision Process (POMDP) solver and a local continuous-environment exploration controller based on a DRL method. In [25], the authors proposed a target-following method based on deep Q-networks, considering visibility obstruction from obstacles and uncertain target motion. In [26], the authors proposed a DRL-based method to enable a robot to explore unknown, cluttered urban environments, in which a deep network with a convolutional neural network (CNN) [27] was trained by the asynchronous advantage actor-critic (A3C) approach to generate appropriate frontier locations.…”
Section: Introduction
confidence: 99%
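As a rough illustration of the kind of network described in [26], the sketch below combines a small CNN trunk with actor and critic heads, as in A3C. The input grid size, number of candidate frontier locations, and layer widths are assumptions, not details from the cited work.

```python
# Minimal sketch of a CNN actor-critic network in the spirit of the
# A3C-based exploration work cited above; input size, number of frontier
# candidates, and layer widths are assumptions.
import torch
import torch.nn as nn

class FrontierActorCritic(nn.Module):
    def __init__(self, num_frontiers: int = 8):
        super().__init__()
        # Shared CNN trunk over a 1-channel 64x64 local occupancy grid.
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        feat = self.trunk(torch.zeros(1, 1, 64, 64)).shape[1]
        self.policy = nn.Linear(feat, num_frontiers)  # actor: frontier logits
        self.value = nn.Linear(feat, 1)               # critic: state value

    def forward(self, grid: torch.Tensor):
        h = self.trunk(grid)
        return self.policy(h), self.value(h)
```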
“…Prior work on target tracking using UAVs has extensively covered a diverse range of training strategies and formulations over the last two decades. One of the most prevalent approaches to target tracking is the development of guidance laws [Wise and Rysdyk, 2006; Choi and Kim, 2014; Oh et al., 2013; Regina and Zanzi, 2011; Chen et al., 2009; Theodorakopoulos and Lacroix, 2008; Pothen and Ratnoo, 2017]. In principle, the target's motion model must be known a priori in order to design guidance laws that satisfy the field-of-view (FOV) constraints.…”
Section: Introduction
confidence: 99%
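A minimal sketch of what such a guidance law can look like is given below, using a pure-pursuit style heading command as a representative example (it is not any specific law from the works cited above, and the gain and function names are assumptions). It also illustrates the stated limitation: the target's position, and implicitly its motion model, must be available to the controller.

```python
# Minimal sketch of a pure-pursuit style guidance law; the target state
# (and, for prediction, its motion model) must be known a priori, which
# is exactly the limitation noted above. Gain and names are assumptions.
import math

def pursuit_heading_rate(uav_x: float, uav_y: float, uav_heading: float,
                         tgt_x: float, tgt_y: float,
                         gain: float = 2.0) -> float:
    """Command a heading rate that turns the UAV toward the target."""
    los = math.atan2(tgt_y - uav_y, tgt_x - uav_x)   # line-of-sight angle
    err = math.atan2(math.sin(los - uav_heading),    # wrap error to [-pi, pi]
                     math.cos(los - uav_heading))
    return gain * err                                # proportional turn command
```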
“…In this paper, we also aim to utilize a simple reinforcement learning technique, modifying it for our specific task. To this end, we extend a deep reinforcement learning approach called Target-Following Deep Q-Network (TF-DQN) [Bhagat and Sujit, 2020] to a Double DQN [Hasselt et al., 2016], which we refer to as TF-DDQN. We also propose a target tracking evaluation scheme that can be used to quantify the performance of any given target tracking algorithm based on factors such as deviation from the target's trajectory, proximity to checkpoints placed along the target's trajectory, and the computational resources required for training and evaluation.…”
Section: Introduction
confidence: 99%
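The defining change when moving from a DQN to a Double DQN is how the bootstrap target is computed: the online network selects the next action and the target network evaluates it, which reduces the overestimation bias of vanilla Q-learning. The sketch below shows that standard target computation; the network objects and discount factor are assumptions, not details of TF-DDQN itself.

```python
# Minimal sketch of the Double DQN target used when extending a DQN such
# as TF-DQN to a Double DQN (TF-DDQN). Network definitions, tensor
# shapes, and gamma are assumptions.
import torch

def double_dqn_targets(online_net, target_net, rewards, next_states,
                       dones, gamma: float = 0.99) -> torch.Tensor:
    """Compute y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    with torch.no_grad():
        best_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q
```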