2021
DOI: 10.1016/j.compag.2021.106350
Collision-free path planning for a guava-harvesting robot based on recurrent deep reinforcement learning

Cited by 87 publications (49 citation statements)
References 15 publications
“…Combining two optimization methods has also been studied [39]. Recently, with the development of deep learning, studies on path planning using RL have mainly been proposed [3], [6], [7], [9], [10], [11], [14], [15], [16], [17], [40], [41], [42]. These works assume a specific scenario and set up an environment in which to apply the agent to path planning.…”
Section: Path Planning (mentioning)
confidence: 99%
“…It has been widely used in various fields such as robotics [1], [2], [3], drones [4], [5], [6], [7], [8], [9], military service [10], [11], and self-driving cars [12], [13]. Recently, reinforcement learning (RL) has mainly been studied for path planning [3], [7], [9], [10], [11], [14], [15], [16], [17]. To obtain an optimal solution, it is essential to provide enough reward for the agent to reach the goal and to set up a specific environment.…”
Section: Introduction (mentioning)
confidence: 99%
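The reward-design point made in this excerpt can be illustrated with a short sketch. The following is a minimal, hypothetical reward function for a goal-reaching agent, not code from the cited works: the terminal bonus, collision penalty, and distance-based shaping weights are all assumed values.

```python
import numpy as np

def step_reward(agent_pos, goal_pos, reached_goal, collided):
    """Reward for one step of a path-planning agent (illustrative values)."""
    if collided:
        return -10.0   # strong penalty for hitting an obstacle
    if reached_goal:
        return +10.0   # large terminal reward for reaching the goal
    # Dense shaping: small per-step cost plus a penalty growing with
    # distance to the goal, so moving closer yields a higher reward.
    dist = np.linalg.norm(np.asarray(goal_pos) - np.asarray(agent_pos))
    return -0.1 - 0.01 * dist
```

The dense distance term supplies a learning signal at every step, which is one common way to give the agent "enough reward" to learn to reach the goal, as the excerpt emphasizes.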
“…r_2 is the decay rate, which is a negative constant. Normalization is commonly used in reward function design [45][46][47] and is calculated as I_w/I_n. I_w is the IAE (integral of absolute error) value computed from the difference between the command speed and the measured output speed.…”
Section: Reward Function Design (mentioning)
confidence: 99%
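The normalization described here can be made concrete with a small sketch. The IAE computation follows the excerpt (absolute difference between command and measured speed, integrated over time); the exponential decay form exp(r_2 · I_w/I_n) and the nominal value I_n are assumptions for illustration, since the excerpt does not give the full expression.

```python
import numpy as np

def iae(command_speed, measured_speed, dt):
    """Integral of Absolute Error between command and measured speed signals."""
    err = np.abs(np.asarray(command_speed) - np.asarray(measured_speed))
    return float(np.sum(err) * dt)

def normalized_reward(command_speed, measured_speed, dt, I_n, r2=-2.0):
    """Reward that decays as the normalized IAE (I_w / I_n) grows.

    r2 is the decay rate, a negative constant, so a larger tracking
    error yields a smaller reward. The exponential form and I_n (a
    nominal IAE used for normalization) are assumed here.
    """
    I_w = iae(command_speed, measured_speed, dt)
    return float(np.exp(r2 * I_w / I_n))
```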
“…The current path planning algorithms mainly include colony algorithms (Liu et al, 2019; Ye et al, 2020; Zhang et al, 2020; Zhu et al, 2020), PSO (Krell et al, 2019; Wang Y. B. et al, 2019; Liu X. H. et al, 2021; Song et al, 2021), A* algorithms (Xiong et al, 2020; Tang et al, 2021; Tullu et al, 2021), artificial potential field methods (Azmi and Ito, 2020; Song et al, 2020; Yao et al, 2020), genetic algorithms (Hao et al, 2020; Li K. R. et al, 2021; Wen et al, 2021), fuzzy control algorithms (Guo et al, 2020; Zhi and Jiang, 2020), fast marching algorithms (Sun et al, 2021; Wang et al, 2021; Xu et al, 2021), and deep reinforcement learning algorithms (Li L. Y. et al, 2021; Lin et al, 2021; Xie et al, 2021). PSO is an evolutionary computation algorithm that finds an optimal solution through collaboration and information sharing among individuals in a group; in path planning, the optimal solution is the shortest path.…”
Section: Introduction (mentioning)
confidence: 99%
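As a concrete illustration of the PSO idea summarized in this excerpt, the sketch below encodes a candidate path as a fixed number of intermediate waypoints and minimizes total path length. All constants (inertia weight, cognitive/social coefficients, bounds) are illustrative, and obstacle handling is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def path_length(waypoints, start, goal):
    """Length of the polyline start -> waypoints -> goal (fitness; lower is better)."""
    pts = np.vstack([start, waypoints.reshape(-1, 2), goal])
    return float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)))

def pso_path(start, goal, n_way=3, n_particles=30, iters=100,
             w=0.7, c1=1.5, c2=1.5, bound=10.0):
    dim = 2 * n_way                                     # each particle: flat (x, y) waypoints
    x = rng.uniform(-bound, bound, (n_particles, dim))  # particle positions
    v = np.zeros_like(x)                                # particle velocities
    pbest = x.copy()                                    # personal bests
    pbest_f = np.array([path_length(p, start, goal) for p in x])
    g = pbest[np.argmin(pbest_f)].copy()                # global best (shared information)
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Velocity update: inertia + cognitive (pbest) + social (gbest) terms
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        f = np.array([path_length(p, start, goal) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[np.argmin(pbest_f)].copy()
    return g.reshape(-1, 2), float(pbest_f.min())

# Example: with no obstacles, the waypoints converge toward the straight line.
waypoints, length = pso_path(np.array([0.0, 0.0]), np.array([8.0, 6.0]))
```

The "collaboration and information sharing" the excerpt mentions corresponds to the social term pulling every particle toward the group's best-known path.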