2023
DOI: 10.1108/ria-10-2022-0248
Efficient experience replay architecture for offline reinforcement learning

Abstract: Purpose: Offline reinforcement learning (RL) acquires effective policies from large-scale, previously collected data. In some scenarios, however, collecting such data is hard because it is time-consuming, expensive or dangerous (e.g. health care, autonomous driving), which motivates a more efficient offline RL method. The purpose of the study is to introduce an algorithm that samples high-value transitions from a prioritized buffer and samples uniformly from a normal experience buffer, improving sample ef…
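The abstract sketches a dual-buffer replay architecture: part of each training batch comes from a prioritized buffer of high-value transitions, and the rest is drawn uniformly from an ordinary buffer. Below is a minimal Python sketch of one plausible reading of that scheme; the class name DualReplayBuffer, the TD-error routing rule and the value_threshold/priority_fraction parameters are illustrative assumptions, not the paper's implementation.

```python
import random
import numpy as np

class DualReplayBuffer:
    """Minimal sketch of a dual-buffer replay scheme (illustrative only):
    transitions with large TD error go to a prioritized buffer, the rest
    to a normal buffer; batches mix priority-proportional and uniform draws."""

    def __init__(self, capacity, value_threshold=1.0, priority_fraction=0.5):
        self.capacity = capacity                    # max size of each buffer
        self.value_threshold = value_threshold      # assumed cutoff for "high-value"
        self.priority_fraction = priority_fraction  # share of a batch from the prioritized buffer
        self.prioritized = []                       # list of (priority, transition)
        self.normal = []                            # list of transitions

    def add(self, transition, td_error):
        # Route by |TD error|: high-value transitions are kept with their
        # priority; both buffers evict oldest entries first (simple FIFO).
        if abs(td_error) >= self.value_threshold:
            self.prioritized.append((abs(td_error), transition))
            self.prioritized = self.prioritized[-self.capacity:]
        else:
            self.normal.append(transition)
            self.normal = self.normal[-self.capacity:]

    def sample(self, batch_size):
        # Mixed batch: priority-proportional draws from the prioritized
        # buffer, uniform draws from the normal buffer. The batch may be
        # smaller than batch_size while the buffers are still filling.
        n_prio = min(int(batch_size * self.priority_fraction), len(self.prioritized))
        n_unif = min(batch_size - n_prio, len(self.normal))
        batch = []
        if n_prio > 0:
            prios = np.array([p for p, _ in self.prioritized], dtype=np.float64)
            probs = prios / prios.sum()
            idx = np.random.choice(len(self.prioritized), size=n_prio,
                                   replace=False, p=probs)
            batch.extend(self.prioritized[i][1] for i in idx)
        if n_unif > 0:
            batch.extend(random.sample(self.normal, n_unif))
        return batch
```

Keeping the two buffers separate means the uniform draws preserve coverage of the dataset while the prioritized draws concentrate updates on informative transitions.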

Cited by 8 publications (2 citation statements)
References 19 publications
“…Motion planning for robotic arms usually uses either sparse rewards or dense rewards [16]. When using sparse rewards, in order to utilize the data efficiently, the recent Prioritized Experience Replay (PER) and Hindsight Experience Replay (HER) methods can achieve good results [17,18].…”
Section: Related Work
Mentioning confidence: 99%
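For context on the PER variant this quote refers to [17], the following is a minimal sketch of proportional prioritization as introduced by Schaul et al.: transitions are sampled with probability P(i) ∝ (|δ_i| + ε)^α, and the resulting bias is corrected with importance-sampling weights w_i = (N · P(i))^(−β). The per_sample helper is a hypothetical name, and the sum-tree normally used for O(log N) sampling is omitted.

```python
import numpy as np

def per_sample(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-6):
    """Proportional Prioritized Experience Replay (sketch).
    td_errors: TD errors for every stored transition.
    Returns sampled indices and normalized importance-sampling weights."""
    priorities = (np.abs(td_errors) + eps) ** alpha   # p_i = (|delta_i| + eps)^alpha
    probs = priorities / priorities.sum()             # P(i) = p_i / sum_k p_k
    idx = np.random.choice(len(td_errors), size=batch_size, p=probs)
    weights = (len(td_errors) * probs[idx]) ** (-beta)  # w_i = (N * P(i))^(-beta)
    weights /= weights.max()                            # normalize for stability
    return idx, weights
```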
“…The dataset’s size is a critical factor, as larger datasets offer more diverse experiences, enabling better generalization. However, the size should be balanced with computational considerations [16]. A dataset’s relevance to the target task is paramount, ensuring that task-specific features contribute to the model’s adaptability.…”
Section: Introduction
Mentioning confidence: 99%