Abstract: Robotic assembly represents a group of benchmark problems for reinforcement learning and variable compliance control that feature sophisticated contact manipulation. One of the key challenges in applying reinforcement learning to physical robots is sample complexity: the need for large amounts of experience for learning. We mitigate this problem by incorporating an iteratively refitted model into the learning process through model-guided exploration. Yet fitting a local model of the physical environment is itself difficult. In this work, a Kalman filter is used to combine adaptive linear dynamics with a coarse prior model derived from an analytical description, and it proves to give more accurate predictions than the existing method. Experimental results show that the proposed model fitting strategy can be incorporated into a model predictive controller to generate good exploration behaviors that accelerate learning, while preserving the benefits of model-free reinforcement learning in uncertain environments. In addition to sample complexity, the inevitable overloading of the robot during operation also tends to limit learning efficiency. To address this problem, we present a method that restricts the largest possible potential energy in the compliance control system and therefore keeps the contact force within the permissible range.

Note to Practitioners: Assembly is labor-intensive work in manufacturing industries, where automation is highly needed. Although the combination of deep reinforcement learning and variable-compliance robot actions has shown significant robustness and adaptability to environmental changes and disturbances compared with constant-stiffness strategies (analogous to an RCC device in robotic assembly), acquiring such a policy remains difficult even with the remarkable progress taking place in the machine learning community.
For skill learning on a physical robotic system, learning speed directly affects production efficiency, so it must be carefully addressed to make learning-based methods applicable for industrial practitioners. However, reinforcement learning in the physical world poses distinct difficulties, chiefly sampling efficiency and exploration safety. In this work, we propose a new data fusion algorithm that fits a local model for efficient model-guided exploration of variable compliance policies. In addition, to ensure safe exploration and reduce the time spent assisting training, a contact force restriction controller is designed and is activated when the robot leaves a pre-defined safe region. Experimental results demonstrate an improvement in policy learning speed in robotic assembly with the proposed method. Currently, the prior model used in local model fitting is specified by a human in analytic form. Although this expression is simple and intuitive, we hope to replace it with human demonstration to further improve accessibility.

*Corresponding author: wud@tsinghua.edu.cn
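The idea of combining a coarse prior model with locally fitted dynamics through a Kalman filter can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes diagonal covariances, treats the prior model's prediction as the filter's prior estimate and the fitted linear model's prediction as a measurement of the same next state, and the function name `kalman_fuse` and all numbers are hypothetical.

```python
def kalman_fuse(prior_pred, prior_var, fitted_pred, fitted_var):
    """Fuse two predictions of the same next state, element by element.

    Assumption (not from the paper): both predictions carry independent
    Gaussian errors with diagonal covariance, so each state dimension
    can be handled with the scalar Kalman update.
    """
    fused, fused_var = [], []
    for xp, vp, xf, vf in zip(prior_pred, prior_var, fitted_pred, fitted_var):
        gain = vp / (vp + vf)                # Kalman gain for this dimension
        fused.append(xp + gain * (xf - xp))  # correct the prior toward the fitted prediction
        fused_var.append((1.0 - gain) * vp)  # posterior variance never exceeds the prior's
    return fused, fused_var


# Hypothetical two-dimensional state: the prior model is uncertain in the
# first dimension (variance 4.0), so the fused value leans on the fitted model.
fused, var = kalman_fuse(
    prior_pred=[1.0, 0.0], prior_var=[4.0, 1.0],
    fitted_pred=[2.0, 1.0], fitted_var=[1.0, 1.0],
)
print(fused, var)
```

When the fitted model has seen little data, its variance would be large and the gain small, so the fused prediction stays close to the prior model; as more local data arrive, the balance shifts toward the fitted dynamics.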
Purpose: This paper aims to improve the accuracy of robot payload identification and reduce the complexity of its industrial application by developing a new method based on the actuator current.

Design/methodology/approach: Instead of building a general robot dynamic model of the actuators and links together with the payload inertial parameters, as previous work does, the paper finds that the difference in actuator torque between the robot moving along the same trajectory with and without the payload can be described directly as a function of the payload inertial parameters. A direct dynamic identification model of the payload is then built, a set of specialized novel exciting trajectories is designed for accurate identification, and the least squares method is applied to estimate the load parameters.

Findings: The experiments confirm the effectiveness of the proposed method in robot payload identification. The identification accuracy is greatly improved compared with that of existing methods based on the actuator current and is close to the accuracy of methods that directly use a wrist-mounted force-torque sensor.

Practical implications: As the experiments indicate, the proposed method expands the application range and greatly improves the accuracy, making payload identification fully operational in industrial applications.

Originality/value: The novelty of this identification method is that it does not require the rotor inertias or the inertial parameters of the links as prior knowledge, and the specially designed trajectories provide complete decoupling of the load parameters.
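The least squares step described above can be sketched on a toy problem. The sketch below is not the paper's identification model: it assumes a single revolute joint with a horizontal axis, where the torque difference with and without payload is delta_tau = (m*r)*g*cos(q) + I*qdd, so the unknowns are the first moment m*r and the inertia I about the joint axis. The sample points, parameter values, and function names are hypothetical, and the data are synthetic and noise-free.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def regressor_row(q, qdd):
    """Row of the regressor for the toy 1-DOF model (an assumption):
    delta_tau = (m*r) * G * cos(q) + I * qdd."""
    return [G * math.cos(q), qdd]


def least_squares_2(rows, targets):
    """Solve min ||Y theta - b||^2 for two parameters via the normal equations."""
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * t for r, t in zip(rows, targets))
    b2 = sum(r[1] * t for r, t in zip(rows, targets))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # The two regressor columns are (nearly) collinear.
        raise ValueError("trajectory does not excite both parameters")
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Synthetic, noise-free data from hypothetical payload parameters.
true_theta = [0.6, 0.05]  # [m*r in kg*m, I in kg*m^2]
samples = [(0.1 * k, math.sin(0.7 * k)) for k in range(20)]  # (q, qdd) pairs
rows = [regressor_row(q, qdd) for q, qdd in samples]
delta_tau = [r[0] * true_theta[0] + r[1] * true_theta[1] for r in rows]

theta_hat = least_squares_2(rows, delta_tau)
print(theta_hat)
```

The determinant check shows in miniature why the choice of trajectory matters: if the samples do not vary position and acceleration independently, the parameters cannot be separated.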