Feedback Deep Deterministic Policy Gradient With Fuzzy Reward for Robotic Multiple Peg-in-Hole Assembly Tasks

Xu, Jing; Hou, Zhimin; Wang, Wei; Xu, Bohao; Zhang, Kuangen; Chen, Ken

doi:10.1109/tii.2018.2868859

Cited by 126 publications

(50 citation statements)

References 14 publications


“…The deep reinforcement learning (DRL) is developed to learn a mapping in an end-to-end method and DRL methods apply the computational power of deep learning model to relaxing the curse of dimensionality in complex tasks [15]. A Deep Deterministic Policy Gradient (DDPG) is a practical RL algorithm and it learns a Q-value function and a policy [16]. The Bellman equation and the off-policy sample data are used to learn the Q-value function and the Q-value is used to train an agent to get a policy.…”

Section: Deep Reinforcement Learning (mentioning)

confidence: 99%


A Confrontation Decision-Making Method with Deep Reinforcement Learning and Knowledge Transfer for Multi-Agent System

2020

Symmetry


In this paper, deep reinforcement learning (DRL) and knowledge transfer are used to achieve the effective control of the learning agent for the confrontation in the multi-agent systems. Firstly, a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm with parameter sharing is proposed to achieve confrontation decision-making of multi-agent. In the process of training, the information of other agents is introduced to the critic network to improve the strategy of confrontation. The parameter sharing mechanism can reduce the loss of experience storage. In the DDPG algorithm, we use four neural networks to generate real-time action and Q-value function respectively and use a momentum mechanism to optimize the training process to accelerate the convergence rate for the neural network. Secondly, this paper introduces an auxiliary controller using a policy-based reinforcement learning (RL) method to achieve the assistant decision-making for the game agent. In addition, an effective reward function is used to help agents balance losses of enemies and our side. Furthermore, this paper also uses the knowledge transfer method to extend the learning model to more complex scenes and improve the generalization of the proposed confrontation model. Two confrontation decision-making experiments are designed to verify the effectiveness of the proposed method. In a small-scale task scenario, the trained agent can successfully learn to fight with the competitors and achieve a good winning rate. For large-scale confrontation scenarios, the knowledge transfer method can gradually improve the decision-making level of the learning agent.
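The citation statement above describes DDPG as learning a Q-value function from the Bellman equation and off-policy samples, then using that Q-value to train the policy. As a rough sketch in the notation commonly used for DDPG (the symbols below are assumptions for illustration and are not taken from the cited papers): $Q(s,a\mid\theta^{Q})$ is the critic, $\mu(s\mid\theta^{\mu})$ is the actor, primed symbols denote target networks, $\gamma$ is the discount factor, and $N$ is the minibatch size drawn from the replay buffer.

```latex
% Bellman target computed with the target networks (off-policy sample i)
y_i = r_i + \gamma \, Q'\!\left(s_{i+1},\, \mu'(s_{i+1}\mid\theta^{\mu'}) \,\middle|\, \theta^{Q'}\right)

% Critic: minimise the mean squared TD error over the minibatch
L(\theta^{Q}) = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - Q(s_i, a_i\mid\theta^{Q})\right)^{2}

% Actor: ascend the critic's estimate of the action value
\nabla_{\theta^{\mu}} J \approx \frac{1}{N}\sum_{i=1}^{N}
  \nabla_{a} Q(s, a\mid\theta^{Q})\Big|_{s=s_i,\,a=\mu(s_i)}
  \,\nabla_{\theta^{\mu}} \mu(s\mid\theta^{\mu})\Big|_{s=s_i}
```

The actor and critic each having a current and a target copy is consistent with the "four neural networks" mentioned in the abstract, although the abstract does not spell out the equations.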


“…Similarly, critic networks are the Q-value function evaluation network, including the target network and current network respectively. Critic updates network parameters using the TD error, as shown in Equation (16).…”

Section: Multi-agent DDPG Algorithm With Parameter Sharing (mentioning)

confidence: 99%

A Confrontation Decision-Making Method with Deep Reinforcement Learning and Knowledge Transfer for Multi-Agent System

2020

Symmetry
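The statement above says the critic updates its parameters using the TD error, referring to an Equation (16) that is not reproduced on this page. The following is a minimal sketch of how a TD error and the resulting critic loss are typically computed for a minibatch, assuming the standard one-step target; the function names (`td_targets`, `td_errors`, `critic_loss`) and the `dones` flag are hypothetical and are not taken from the cited paper.

```python
from typing import List, Sequence


def td_targets(rewards: Sequence[float],
               next_q_values: Sequence[float],
               dones: Sequence[bool],
               gamma: float = 0.99) -> List[float]:
    """One-step Bellman targets: r + gamma * Q_target(s', a').

    next_q_values are assumed to come from the target critic evaluated
    at the target actor's action; terminal transitions do not bootstrap.
    """
    return [r + gamma * (0.0 if done else q_next)
            for r, q_next, done in zip(rewards, next_q_values, dones)]


def td_errors(q_values: Sequence[float],
              targets: Sequence[float]) -> List[float]:
    """TD error per sample: target minus the current critic's estimate."""
    return [y - q for q, y in zip(q_values, targets)]


def critic_loss(errors: Sequence[float]) -> float:
    """Mean squared TD error, the quantity the critic would minimise."""
    return sum(e * e for e in errors) / len(errors)


if __name__ == "__main__":
    targets = td_targets([1.0, 0.0], [2.0, 5.0], [False, True], gamma=0.5)
    errors = td_errors([1.5, 1.0], targets)
    print(targets, errors, critic_loss(errors))
```

In a full implementation the loss would be differentiated with respect to the current critic's parameters, and the target networks would be updated more slowly; those steps are omitted here because the page does not describe them.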


“…Xu et al [15] proposed learning dual peg insertion by using the deep deterministic policy gradient [16] (DDPG) algorithm with a fuzzy reward system. Similarly, Fan et al [17] used DDPG combined with guided policy search (GPS) [18] to learn high-precision assembly tasks.…”

Section: Introduction (mentioning)

confidence: 99%

Variable Compliance Control for Robotic Peg-in-Hole Assembly: A Deep-Reinforcement-Learning Approach

et al. 2020


Industrial robot manipulators are playing a significant role in modern manufacturing industries. Though peg-in-hole assembly is a common industrial task that has been extensively researched, safely solving complex, high-precision assembly in an unstructured environment remains an open problem. Reinforcement-learning (RL) methods have proven to be successful in autonomously solving manipulation tasks. However, RL is still not widely adopted in real robotic systems because working with real hardware entails additional challenges, especially when using position-controlled manipulators. The main contribution of this work is a learning-based method to solve peg-in-hole tasks with hole-position uncertainty. We propose the use of an off-policy, model-free reinforcement-learning method, and we bootstrapped the training speed by using several transfer-learning techniques (sim2real) and domain randomization. Our proposed learning framework for position-controlled robots was extensively evaluated in contact-rich insertion tasks in a variety of environments.


“…[3, 4]. In the contact tasks of industrial robots, the contact force needs to be included [5, 6, 7, 8, 9]. The actions taken by a teacher can be perceived by sensors, such as visual sensors to capture a teacher’s body movements [10, 11] or recognize a teacher’s gestures [12], wearable sensors, and force sensors to perceive a teacher’s behavioral intentions [13, 14, 15].…”

Section: Introduction (mentioning)

confidence: 99%

“…The actions taken by a teacher can be perceived by sensors, such as visual sensors to capture a teacher’s body movements [10, 11] or recognize a teacher’s gestures [12], wearable sensors, and force sensors to perceive a teacher’s behavioral intentions [13, 14, 15]. Compared with visual sensors, wearable sensors, etc., force sensor-based kinesthetic teaching is suitable for non-professionals to tell the robot the action needed to be taken in current state in a simple and intuitive way [5, 6, 7, 13, 14, 15, 16].…”

Section: Introduction (mentioning)

confidence: 99%

Development and Application of a Tandem Force Sensor

Zhang, Chen, Zhang

2020

Sensors


In robot teaching for contact tasks, it is necessary to not only accurately perceive the traction force exerted by hands, but also to perceive the contact force at the robot end. This paper develops a tandem force sensor to detect traction and contact forces. As a component of the tandem force sensor, a cylindrical traction force sensor is developed to detect the traction force applied by hands. Its structure is designed to be suitable for humans to operate, and the mechanical model of its cylinder-shaped elastic structural body has been analyzed. After calibration, the cylindrical traction force sensor is proven to be able to detect forces/moments with small errors. Then, a tandem force sensor is developed based on the developed cylindrical traction force sensor and a wrist force sensor. A robot teaching experiment on drawer switches was carried out, and the results confirm that the developed traction force sensor is simple to operate and the tandem force sensor can achieve the perception of the traction and contact forces.
