This paper exploits Lipschitz continuity properties of Markov Decision Processes to safely speed up policy-gradient algorithms. Starting from assumptions about the Lipschitz continuity of the state-transition model, the reward function, and the policies considered in the learning process, we show that both the expected return of a policy and its gradient are Lipschitz continuous w.r.t. policy parameters. By leveraging these properties, we define policy-parameter updates that guarantee a performance improvement at each iteration. The proposed methods are empirically evaluated and compared to related approaches on different configurations of three popular control scenarios: the linear quadratic regulator, the mass-spring-damper system, and ship-steering control.
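The guaranteed-improvement idea described in this abstract can be illustrated with a standard argument (a sketch, not the paper's exact method): if the gradient of the expected return is Lipschitz continuous with a known constant L, a quadratic lower bound on the return holds, and maximizing that bound yields a step size that provably cannot decrease performance. The constant and gradient values below are illustrative.

```python
# Sketch, assuming a scalar parameter and a known Lipschitz constant L of
# the policy gradient. The quadratic lower bound
#   J(theta + d) >= J(theta) + g*d - (L/2)*d**2
# is maximized at d* = g / L, guaranteeing a gain of at least g**2 / (2*L).

def safe_step(grad, lipschitz_const):
    """Step size maximizing the guaranteed-improvement lower bound."""
    return grad / lipschitz_const

def improvement_lower_bound(grad, step, lipschitz_const):
    """Guaranteed performance gain for a given parameter update."""
    return grad * step - 0.5 * lipschitz_const * step ** 2

g, L = 2.0, 4.0          # illustrative gradient estimate and constant
d_star = safe_step(g, L)                       # 0.5
gain = improvement_lower_bound(g, d_star, L)   # g**2 / (2*L) = 0.5
```

Any step shorter or longer than `d_star` yields a strictly smaller guaranteed gain, which is the sense in which the update is "safe" while still being as large as the bound allows.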
The present paper addresses the issues involved in developing walk-through programming techniques (i.e., manual guidance of the robot) in an industrial scenario. First, an exact formulation of the dynamics of the tool that the human should feel when interacting with the robot is presented. Then, the paper discusses how to implement such dynamics on an industrial robot equipped with an open robot control system and a wrist force/torque sensor, as well as the safety issues related to walk-through programming. In particular, two strategies that use admittance control to constrain the robot motion are presented. The first slows the robot down when the velocity of the tool centre point exceeds a specified safety limit; the second limits the robot workspace by means of virtual safety surfaces. Experimental results on a COMAU Smart Six robot are presented, showing the performance of the walk-through programming system endowed with the two proposed safety strategies.
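The first safety strategy above can be sketched in a few lines (an illustrative sketch, not the paper's controller): when the norm of the tool-centre-point velocity produced by the admittance controller exceeds a limit, the commanded velocity is scaled down uniformly so that its direction is preserved. The limit `v_max` is a hypothetical value, not one from the paper.

```python
import math

# Illustrative velocity-limiting strategy for walk-through programming:
# scale the admittance-generated TCP velocity whenever its norm exceeds
# a safety limit, keeping the direction of motion unchanged.
def limit_tcp_velocity(v, v_max):
    speed = math.sqrt(sum(c * c for c in v))
    if speed <= v_max or speed == 0.0:
        return list(v)            # already safe: pass through unchanged
    scale = v_max / speed         # shrink factor so the norm equals v_max
    return [c * scale for c in v]

v_cmd = limit_tcp_velocity([3.0, 4.0, 0.0], 2.5)  # norm 5.0 -> scaled to 2.5
```

Uniform scaling is the natural choice here because it saturates the speed without bending the operator's intended direction of motion.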
This paper proposes a manipulability optimization control of a 7-DoF robot manipulator for Robot-Assisted Minimally Invasive Surgery (RAMIS) that simultaneously guarantees a Remote Center of Motion (RCM). The first degree of redundancy of the manipulator is used to achieve the RCM constraint; the second is used for manipulability optimization. A hierarchical operational space formulation is introduced to integrate all the control components, including a Cartesian compliance control for the main surgical task, a first null-space controller for the RCM constraint, and a second null-space controller for manipulability optimization. Experiments with virtual surgical tasks, in an augmented reality environment, were performed to validate the proposed control strategy on the KUKA LWR4+. The results demonstrate that end-effector accuracy and the RCM constraint can be guaranteed while improving the manipulability of the surgical tip.
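The hierarchical null-space structure described above follows the standard operational-space technique of projecting a lower-priority velocity through the null space of the higher-priority task's Jacobian, so the secondary task cannot disturb the primary one. The sketch below uses toy matrices, not the KUKA LWR4+ kinematics.

```python
import numpy as np

# Two-level task hierarchy via null-space projection (standard technique;
# a sketch with toy values, not the paper's full three-level controller).
def hierarchical_velocities(J1, dx1, dq2):
    """Joint velocities tracking dx1 exactly, with dq2 acting only
    in the null space of the primary Jacobian J1."""
    J1_pinv = np.linalg.pinv(J1)
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1   # null-space projector of J1
    return J1_pinv @ dx1 + N1 @ dq2

J1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])   # toy 2x3 primary-task Jacobian
dx1 = np.array([0.1, -0.2])        # primary task velocity (e.g. RCM)
dq2 = np.array([0.0, 0.0, 0.5])    # secondary velocity (e.g. manipulability)
dq = hierarchical_velocities(J1, dx1, dq2)
# J1 @ dq reproduces dx1 exactly: the secondary task leaves the primary
# task untouched.
```

Chaining a third level (as in the paper's formulation) amounts to projecting it through the combined null space of the first two tasks.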
Dynamic Movement Primitives (DMPs) are a common method for learning a control policy for a task from demonstration. This control policy consists of differential equations that can generate a smooth trajectory to a new goal point. However, DMPs have only a limited ability to generalize a demonstration to new environments and to solve problems such as obstacle avoidance. Moreover, standard DMP learning does not cope with the noise inherent in human demonstrations. Here, we propose an approach for robot learning from demonstration that can generalize noisy task demonstrations to a new goal point and to an environment with obstacles. This strategy results in a control policy that incorporates different types of learning from demonstration, which correspond to different types of observational learning as outlined in developmental psychology.
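The "differential equations that can generate a smooth trajectory to a new goal point" mentioned above are, in the standard DMP formulation, a critically damped spring-damper system plus a learned forcing term. The sketch below integrates only the spring-damper part (forcing term omitted for brevity), which is what guarantees smooth convergence to the goal; the gains are conventional illustrative values.

```python
# Minimal discrete DMP transformation system (standard formulation, with
# the learned forcing term omitted for brevity):
#   tau * z_dot = alpha_z * (beta_z * (g - y) - z)
#   tau * y_dot = z
# With beta_z = alpha_z / 4 the system is critically damped and converges
# smoothly to the goal g without oscillation. Integrated with forward Euler.
def dmp_rollout(y0, g, alpha_z=25.0, beta_z=25.0 / 4.0, tau=1.0,
                dt=0.001, steps=2000):
    y, z = y0, 0.0
    for _ in range(steps):
        z_dot = alpha_z * (beta_z * (g - y) - z) / tau
        y_dot = z / tau
        z += z_dot * dt
        y += y_dot * dt
    return y

final = dmp_rollout(0.0, 1.0)  # converges close to the goal 1.0
```

Learning from demonstration then amounts to fitting the forcing term so the system reproduces the demonstrated trajectory shape while this spring-damper backbone still pulls it to whatever new goal is supplied.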