2017 IEEE International Conference on Robotics and Automation (ICRA) 2017
DOI: 10.1109/icra.2017.7989381

Target-driven visual navigation in indoor scenes using deep reinforcement learning

Abstract: Two less-addressed issues of deep reinforcement learning are (1) lack of generalization capability to new target goals, and (2) data inefficiency, i.e., the model requires several (and often costly) episodes of trial and error to converge, which makes it impractical to apply to real-world scenarios. In this paper, we address these two issues and apply our model to the task of target-driven visual navigation. To address the first issue, we propose an actor-critic model whose policy is a function of the goal…
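The truncated abstract describes an actor-critic model whose policy is conditioned on the goal as well as the current observation. A minimal sketch of that idea follows, assuming purely illustrative feature dimensions, a target given as an image embedding, and randomly initialized weights; none of these sizes, names, or details come from the paper itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
OBS_DIM, GOAL_DIM, EMBED_DIM, N_ACTIONS = 2048, 2048, 512, 4

# Separate projections for the current observation and the goal image,
# plus an actor head (action logits) and a critic head (state value).
W_obs = rng.standard_normal((OBS_DIM, EMBED_DIM)) * 0.01
W_goal = rng.standard_normal((GOAL_DIM, EMBED_DIM)) * 0.01
W_pi = rng.standard_normal((2 * EMBED_DIM, N_ACTIONS)) * 0.01
W_v = rng.standard_normal((2 * EMBED_DIM, 1)) * 0.01

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def policy_and_value(obs_feat, goal_feat):
    """Goal-conditioned actor-critic: the policy depends on BOTH the
    current observation features and the target-image features."""
    joint = np.concatenate([np.tanh(obs_feat @ W_obs),
                            np.tanh(goal_feat @ W_goal)])
    return softmax(joint @ W_pi), (joint @ W_v).item()

probs, value = policy_and_value(rng.standard_normal(OBS_DIM),
                                rng.standard_normal(GOAL_DIM))
```

Because the goal enters the network as an input rather than being baked into the weights, the same parameters can, in principle, be reused for new targets, which is the generalization point the abstract raises.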


citations
Cited by 1,288 publications
(1,116 citation statements)
references
References 41 publications
3
1,060
0
4
Order By: Relevance
“…Another promising approach has been to use simulations to obtain training data for reinforcement or imitation learning tasks, while testing the learned policy in the real world [14], [15], [11]. Clearly, this approach suffers from the domain shift between simulation and reality and might require some real-world data to be able to generalize [11].…”
Section: Related Work
confidence: 99%
“…These methodologies can be divided into two main categories: (i) methods based on reinforcement learning (RL) [7], [11] and (ii) methods based on supervised learning [6], [12], [9], [10], [13].…”
Section: Related Work
confidence: 99%
“…Asynchronous parallel sampling is applied to a number of deep reinforcement learning algorithms, including DQN and an actor-critic method (called A3C), which achieved state-of-the-art results on the Atari benchmark and demonstrated effective performance in a navigation task in a 3D simulated environment. In [11], additional information about the target is used to speed up and improve deep RL; this information is provided in the form of images of the target.…”
Section: Related Work
confidence: 99%
“…For comparison, in our experiments on a 30x30 grid, the shortest route to the target consists of 57 actions, whereas a recent study proposed a new complex simulator with realistic graphics [11] in which the shortest trajectory to the target is typically fewer than 20 actions.…”
Section: A Grid Navigation Task
confidence: 99%
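The 57-action figure quoted above is just the shortest-path length on an open grid. As a hedged illustration of that grid task (assuming an unobstructed 4-connected grid; this is not the cited study's environment or code), the route length can be computed with breadth-first search:

```python
from collections import deque

def shortest_route_length(grid_size, start, target):
    """BFS shortest-path length, in actions, on an open grid_size x grid_size
    grid with 4-connected moves (up/down/left/right)."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (r, c), dist = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < grid_size and 0 <= nc < grid_size and (nr, nc) not in seen:
                if (nr, nc) == target:
                    return dist + 1
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1  # unreachable (cannot happen on an open grid)
```

On an open grid this reduces to the Manhattan distance, so a 57-action shortest route corresponds to, e.g., start (0, 0) and target (29, 28) on a 30x30 grid.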