2021
DOI: 10.3390/s21072534
Deep Reinforcement Learning for End-to-End Local Motion Planning of Autonomous Aerial Robots in Unknown Outdoor Environments: Real-Time Flight Experiments

Abstract: Autonomous navigation and collision avoidance missions represent a significant challenge for robotics systems as they generally operate in dynamic environments that require a high level of autonomy and flexible decision-making capabilities. This challenge becomes more applicable in micro aerial vehicles (MAVs) due to their limited size and computational power. This paper presents a novel approach for enabling a micro aerial vehicle system equipped with a laser range finder to autonomously navigate among obstac…

Cited by 23 publications (12 citation statements)
References 29 publications
“…proposed a simple and efficient energy-based method [34] to prioritize replay of ‘posterior experience’, innovatively using ‘trace energy’ instead of TD-error as the measure of priority. In tasks such as continuous control [35], an actor-critic reinforcement learning algorithm is used to achieve autonomous navigation and obstacle avoidance for UAVs, with robustness to unknown environments obtained through localization noise. Junjie Zeng et al.…”
Section: Related Work
confidence: 99%
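The snippet above contrasts two priority measures for experience replay: the conventional TD-error and the cited "trace energy" score. A minimal sketch of energy-proportional sampling is shown below; the class name, the eviction policy, and the idea of passing the energy score in directly are illustrative assumptions, not the cited paper's implementation.

```python
import random

class EnergyPrioritizedReplay:
    """Sketch of a replay buffer that samples transitions in proportion
    to an externally supplied 'energy' score instead of TD-error.
    (Hypothetical structure; not the cited method's actual code.)"""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.buffer = []  # list of (transition, energy) pairs

    def add(self, transition, energy):
        # Evict the oldest transition once the buffer is full.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
        self.buffer.append((transition, energy))

    def sample(self, batch_size):
        # Roulette-wheel sampling: probability of picking a transition
        # is proportional to its stored energy score.
        total = sum(e for _, e in self.buffer)
        picks = []
        for _ in range(batch_size):
            r = random.uniform(0.0, total)
            acc = 0.0
            for transition, energy in self.buffer:
                acc += energy
                if acc >= r:
                    picks.append(transition)
                    break
        return picks
```

Swapping the priority signal only changes what is passed as `energy`; the sampling machinery is identical to TD-error-based prioritized replay.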
“…Liu et al [19] used RL and a probability map to create a search algorithm and improve its detection ability. Finally, deep RL is used for local motion planning in an unknown environment in [20], and for trajectory tracking and altitude control in [21].…”
Section: Related Work
confidence: 99%
“…Further, compared to works such as [18], where a Deep Q-Network capable of generating only discrete actions is used, our work uses a policy-gradient-based reinforcement learning algorithm capable of generating continuous actions. Compared to another similar work [20], we use RGB-D data as the input to our deep RL agent and generate 3D actions, whereas they use 2D lidar data and generate 2D actions. Further, our goal is an image, while their goal is a point fed to the algorithm.…”
Section: Related Work
confidence: 99%
“…Kong et al [19] explored the generalization of various DRL algorithms by training them in different (but not unseen) environments. Doukhi et al [20] tackle this issue by mapping exteroceptive sensor readings, robot state, and goal information to continuous velocity control inputs, but their exploration was tested only on unseen targets rather than unseen scenes.…”
Section: Introduction
confidence: 99%
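The end-to-end mapping described in the last statement, from exteroceptive sensor readings, robot state, and goal information directly to velocity commands, can be sketched as a flat observation vector fed through a policy. The observation layout, dimensions, and the linear stand-in for the deep network are assumptions for illustration, not the cited paper's architecture.

```python
import random

def build_observation(lidar_ranges, robot_state, goal_rel):
    """End-to-end local planners typically concatenate sensor readings,
    proprioceptive state, and relative-goal info into one flat vector."""
    return list(lidar_ranges) + list(robot_state) + list(goal_rel)

class LinearPolicy:
    """Placeholder for the actual deep network: a single linear map from
    the flat observation to two velocity commands (v, w). (Sketch only.)"""

    def __init__(self, obs_dim, act_dim=2, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                  for _ in range(act_dim)]

    def act(self, obs):
        # One dot product per output command.
        return [sum(wi * oi for wi, oi in zip(row, obs)) for row in self.w]
```

In the actual end-to-end setting the linear map is replaced by a trained deep network, but the interface, flat observation in, continuous velocity command out, is the same.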