Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay

Kim, Myeong Seop; Han, Do Hung; Park, Jae‐Han; Kim, Jung Su

doi:10.3390/app10020575

Cited by 69 publications

(29 citation statements)

References 22 publications


“…The performance of the proposed SAC-based path planning is validated not only simulation but also experiment using two real open manipulators. The results show that the proposed method finds a shorter and smoother path for most scenarios due to enhanced exploration performance by SAC, and outperforms over the existing results such as PRM [ 29 ] and TD3 (Twin Delayed Deep Deterministic Policy Gradient)-based path planning [ 30 ].…”

Section: Introduction (mentioning)

confidence: 89%

Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor–Critic with Hindsight Experience Replay

Prianto

Kim

Park

et al. 2020

Sensors

Self Cite


Since path planning for multi-arm manipulators is a complicated high-dimensional problem, effective and fast path generation is not easy for the arbitrarily given start and goal locations of the end effector. Especially, when it comes to deep reinforcement learning-based path planning, high-dimensionality makes it difficult for existing reinforcement learning-based methods to have efficient exploration which is crucial for successful training. The recently proposed soft actor–critic (SAC) is well known to have good exploration ability due to the use of the entropy term in the objective function. Motivated by this, in this paper, a SAC-based path planning algorithm is proposed. The hindsight experience replay (HER) is also employed for sample efficiency and configuration space augmentation is used in order to deal with complicated configuration space of the multi-arms. To show the effectiveness of the proposed algorithm, both simulation and experiment results are given. By comparing with existing results, it is demonstrated that the proposed method outperforms the existing results.


“…In future research, to successfully reach the target point, the motion planning algorithm must be improved via reinforcement learning. For the robots to execute reinforcement learning based on earned rewards from several repeated several trials and errors, the manipulator should be able to determine the optimal trajectory by itself [30]. Also the manipulators decide the order to target fruit according to produced optimal trajectory based on reinforcement learning.…”

Section: B Reinforcement Learning-based Path Planning (mentioning)

confidence: 99%

Towards an Efficient Tomato Harvesting Robot: 3D Perception, Manipulation, and End-Effector

et al. 2021


Fruit and vegetable harvesting robots have been widely studied and developed in recent years. However, despite extensive research, commercial tomato harvesting robots still remain a challenge. In this paper, we propose an efficient tomato harvesting robot that combines the principle of 3D perception, Manipulation, and an End-effector. For this robot, tomatoes are detected based on deep learning, after which 3D coordinates of the target crop are extracted and motion control of the manipulator based on 3D coordination. In addition, a suction pad featuring the kirigami pattern, which is a part of the suction gripper, was developed to grip individual tomatoes in clusters. A scissor-shaped cutting module with an assist unit, which is used to overcome structural limitations and implement effective cutting, was also designed and tested. The proposed tomato harvesting robot was validated and evaluated on a laboratory testbed based on the performance of each component. Therefore, in this study, we propose and verify a new robot design for the effective harvesting of tomatoes. INDEX TERMS Harvesting robot, end-effector, 3D perception, tractional cutting unit.


“…The achievements of the research on multi-agent dynamic task allocation [30][31][32] are mainly based on heuristic intelligent algorithms. Intelligent algorithms mainly use environmental learning or heuristic search, such as A* algorithms [33], evolutionary algorithms [34][35][36], and neural network-based methods, etc.…”

Section: Algorithms For Task Allocation (mentioning)

confidence: 99%

A Novel Hierarchical Soft Actor-Critic Algorithm for Multi-Logistics Robots Task Allocation

et al. 2021


In intelligent unmanned warehouse goods-to-man systems, the allocation of tasks has an important influence on the efficiency because of the dynamic performance of AGV robots and orders. The paper presents a hierarchical Soft Actor-Critic algorithm to solve the dynamic scheduling problem of orders picking. The method proposed is based on the classic Soft Actor-Critic and hierarchical reinforcement learning algorithm. In this paper, the model is trained at different time scales by introducing sub-goals, with the top-level learning a policy and the bottom level learning a policy to achieve the sub-goals. The actor of the controller aims to maximize expected intrinsic reward while also maximizing entropy. That is, to succeed at the subgoals while moving as randomly as possible. Finally, experimental results for simulation experiments in different scenes show that the method can make multi-logistics AGV robots work together and improves the reward in sparse environments about 2.61 times compared to the SAC algorithm. INDEX TERMS multi-logistics robot; task allocation; deep reinforcement learning; Actor-Critic; hierarchical reinforcement learning

