Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments

Park, Minjae; Lee, Seok-Young; Hong, Jin; Kwon, Nam Kyu

doi:10.3390/s22249574

Cited by 10 publications

(11 citation statements)

References 33 publications


“…The network model outputs continuous linear and angular velocities to control the robot’s forward movement and steering. The […] and the previous action serve as state input, enabling the neural network to gauge the robot’s speed and its distance from the target [34]. It is pertinent to highlight that […] represents the Euclidean distance between the target coordinates and the current robot coordinates, which is defined as follows: …”

Section: Methods (mentioning)

confidence: 99%
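The equation itself is elided in the excerpt above. As a minimal sketch of the standard Euclidean distance it refers to, the following Python computes the robot-to-target distance and assembles a small state vector. The function names, the 2D coordinate tuples, and the state layout (distance followed by the previous linear and angular velocity) are assumptions for illustration, not the citing paper's definition.

```python
import math


def distance_to_goal(robot_xy, goal_xy):
    """Euclidean distance between the current robot position and the target."""
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    return math.hypot(dx, dy)


def build_state(robot_xy, goal_xy, prev_action):
    """Hypothetical state layout: [distance, previous linear v, previous angular w]."""
    linear_v, angular_w = prev_action
    return [distance_to_goal(robot_xy, goal_xy), linear_v, angular_w]


state = build_state(robot_xy=(0.0, 0.0), goal_xy=(3.0, 4.0), prev_action=(0.2, -0.1))
print(state)
```

Feeding the previous action back into the state is what lets a feed-forward network infer the robot's current speed, as the excerpt describes.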

Inspection Robot Navigation Based on Improved TD3 Algorithm

Huang, Xie, Yan (2024), Sensors


The swift advancement of robotics has made navigation an essential task for mobile robots. Map-based navigation methods depend on global environmental maps for decision-making, so their efficacy falls short in unfamiliar or dynamic settings. Current deep reinforcement learning navigation strategies can navigate successfully without pre-existing map data, yet they struggle with inefficient training, slow convergence, and sparse rewards. To tackle these challenges, this study introduces an improved twin delayed deep deterministic policy gradient algorithm (LP-TD3) for local planning navigation. First, a long short-term memory (LSTM) module and the Prioritized Experience Replay (PER) mechanism are integrated into the existing TD3 framework to optimize training and improve the efficiency of experience data utilization. Second, an Intrinsic Curiosity Module (ICM) merges intrinsic with extrinsic rewards to tackle the sparse reward problem and enhance exploratory behavior. Experimental evaluations using the ROS and Gazebo simulators demonstrate that the proposed method outperforms the original TD3 algorithm on various performance metrics.
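To make the idea of merging intrinsic with extrinsic rewards concrete, here is a minimal sketch in plain Python. It follows the usual curiosity formulation, in which the intrinsic reward is a scaled squared prediction error of a forward model in feature space. The function names, the `eta` scale, and the mixing `weight` are hypothetical and are not taken from the LP-TD3 paper.

```python
def intrinsic_reward(predicted_next_features, actual_next_features, eta=1.0):
    """Curiosity bonus: (eta / 2) * squared error of the forward-model prediction."""
    squared_error = sum(
        (p - a) ** 2
        for p, a in zip(predicted_next_features, actual_next_features)
    )
    return 0.5 * eta * squared_error


def total_reward(extrinsic, intrinsic, weight=0.5):
    """Combine the environment reward with the weighted curiosity bonus."""
    return extrinsic + weight * intrinsic


# A sparse environment returns 0 for most steps; the bonus still gives a signal.
bonus = intrinsic_reward([1.0, 2.0], [1.0, 4.0])
print(total_reward(extrinsic=0.0, intrinsic=bonus))
```

A transition the forward model predicts poorly earns a larger bonus, which nudges the agent toward unfamiliar states even when the extrinsic reward is zero.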


“…They test and validate their approach using the CARLA simulator [25]. However, Park et al. apply the deep deterministic policy gradient (DDPG) path-planning method for mobile robots using the Gazebo simulator [26].…”

Section: Literature Review (mentioning)

confidence: 99%

Implementing Deep Reinforcement Learning in Autonomous Control Systems

Noureldin Ragheb, Mervat M. A. Mahmoud (2024), ARASET


Developing a safe and reliable autonomous vehicle has been a significant focus in recent years. Supervised learning methods require large amounts of labelled data for training, which makes them expensive. The performance of these agents is limited by the data provided in training and by their inability to generalize to different environments. In addition, some driving situations, such as near-accident scenarios, are difficult to cover in the training data. As a result, the autonomous driving agent may behave unexpectedly in safety-critical situations, making it unreliable for safe transportation. Reinforcement learning is a potential solution for these issues. This research paper explores the potential of applying deep reinforcement learning techniques to autonomous driving, with a focus on comparing two popular deep reinforcement learning algorithms: Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG). The study uses the CARLA simulator, which provides a realistic environment and conditions for testing autonomous driving algorithms. The study finds that DDPG outperforms DQN in average reward, but DQN performs better in collision rate.


“…The technology applied in this study was previously introduced in the authors’ earlier work [23]. In [23], a method integrating the HER technique to assist in finding the optimal policy was proposed and demonstrated its effectiveness in both simulation and real-world environments without obstacles.…”

Section: Preliminaries (mentioning)

confidence: 99%

“…The agent was trained in a simple driving environment within the simulation. We demonstrated that the proposed method operates effectively in both simulated and actual environments [23]. The HER has also been widely applied in the fields of mobile robotics and robot arm control [24, 25, 26, 27, 28].…”

Section: Introduction (mentioning)

confidence: 99%

Autonomous Driving of Mobile Robots in Dynamic Environments Based on Deep Deterministic Policy Gradient: Reward Shaping and Hindsight Experience Replay

Park, Kwon (2024), Biomimetics

Self Cite


In this paper, we propose a reinforcement learning-based end-to-end learning method for the autonomous driving of a mobile robot in a dynamic environment with obstacles. Applying two additional techniques for reinforcement learning simultaneously helps the mobile robot in finding an optimal policy to reach the destination without collisions. First, the multifunctional reward-shaping technique guides the agent toward the goal by utilizing information about the destination and obstacles. Next, employing the hindsight experience replay technique to address the experience imbalance caused by the sparse reward problem assists the agent in finding the optimal policy. We validated the proposed technique in both simulation and real-world environments. To assess the effectiveness of the proposed method, we compared experiments for five different cases.
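The hindsight experience replay step mentioned above can be sketched as follows. This is a minimal illustration of the common "final" relabeling strategy, in which the position actually reached at the end of an episode is treated as if it had been the goal. The transition dictionary keys, the tolerance, and the 1/0 sparse reward convention are assumptions, not the authors' implementation.

```python
import math


def sparse_reward(achieved_xy, goal_xy, tolerance=0.1):
    """Hypothetical sparse reward: 1.0 only when the goal is reached."""
    return 1.0 if math.dist(achieved_xy, goal_xy) <= tolerance else 0.0


def relabel_with_final_goal(episode, tolerance=0.1):
    """Return copies of the transitions with the goal replaced by the final position."""
    final_xy = episode[-1]["achieved_xy"]
    relabeled = []
    for transition in episode:
        new_transition = dict(transition)
        new_transition["goal_xy"] = final_xy
        new_transition["reward"] = sparse_reward(
            transition["achieved_xy"], final_xy, tolerance
        )
        relabeled.append(new_transition)
    return relabeled


# A failed episode: the robot never reaches (5, 5), so every reward is 0.
episode = [
    {"achieved_xy": (0.5, 0.0), "goal_xy": (5.0, 5.0), "reward": 0.0},
    {"achieved_xy": (1.0, 0.0), "goal_xy": (5.0, 5.0), "reward": 0.0},
]
hindsight = relabel_with_final_goal(episode)
print([t["reward"] for t in hindsight])
```

Storing both the original and the relabeled transitions in the replay buffer gives the agent successful examples even when the true goal is rarely reached, which is how the technique counters the experience imbalance described in the abstract.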
