Latent space policy search for robotics

Torras

2018

IEEE Trans. Robot.

Dynamic Movement Primitives (DMPs) are nowadays widely used as movement parametrization for learning robot trajectories, because of their linearity in the parameters, rescaling robustness and continuity. However, when learning a movement with DMPs, a very large number of Gaussian approximations needs to be performed. Adding them up for all joints yields too many parameters to be explored when using Reinforcement Learning (RL), thus requiring a prohibitive number of experiments/simulations to converge to a solution with a (locally or globally) optimal reward. In this paper we address the process of simultaneously learning a DMPcharacterized robot motion and its underlying joint couplings through linear Dimensionality Reduction (DR), which will provide valuable qualitative information leading to a reduced and intuitive algebraic description of such motion. The results in the experimental section show that not only can we effectively perform DR on DMPs while learning, but we can also obtain better learning curves, as well as additional information about each motion: linear mappings relating joint values and some latent variables.

Section: Dimensionality Reduction In the Parameter Space (Pdr-dmp)mentioning

confidence: 99%

Section: Experimentationmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Dimensionality Reduction for Dynamic Movement Primitives and Application to Bimanual Manipulation of Clothes

Torras

2018

IEEE Trans. Robot.

“…In [1], the authors proposed to reduce the dimensionality of a ProMP by performing linear DR in the space of degrees of freedom (DoF) of the robot, to reduce the number of DoFs from d to r. This had the impact of reducing the dimensionality of the parameter vector ω from dN f to rN f , with r < d, N f being the number of Gaussian kernels used per DoF. While reducing the dimensionality in the robots' DoF has advantages such as a better qualitative understanding of the task, a smaller linear projection matrix which is easier to estimate, and it is also used in other approaches [18]; However, here we propose to reduce the dimensionality in the space of the Gaussian weight vectors ω. This variation is introduced to then build a GMM in the common latent parameter space and has the advantage of fine-tuning the dimensionality of the latent space, given that we can encode actions that are more different, without loosing too much information.…”

Section: A Dimensionality Reduction Of Prompsmentioning

confidence: 99%

Dimensionality Reduction in Learning Gaussian Mixture Models of Movement Primitives for Contextualized Action Selection and Adaptation

IEEE Robot. Autom. Lett.

Torras

2018

Robotic manipulation often requires adaptation to changing environments. Such changes can be represented by a certain number of contextual variables that may be observed or sensed in different manners. When learning and representing robot motion-usually with movement primitives-, it is desirable to adapt the learned behaviors to the current context. Moreover, different actions or motions can be considered in the same framework, using contextualization to decide which action applies to which situation. Such frameworks, however, may easily become large-dimensional, thus requiring to reduce the dimensionality of the parameters space, as well as the amount of data needed to generate and improve the model over experience. In this paper, we propose an approach to obtain a generative model from a set of actions that share a common feature. Such feature, namely a contextual variable, is plugged into the model to generate motion. We encode the data with a Gaussian Mixture Model in the parameter space of Probabilistic Movement Primitives (ProMPs), after performing Dimensionality Reduction (DR) on such parameter space, in a similar fashion as in [1]. We append the contextual variable to the parameter space and obtain the number of Gaussian components, i.e., different actions in a dataset, through Persistent Homology. Then, using multimodal Gaussian Mixture Regression (GMR) [2], we can retrieve the most likely actions given a contextual situation and execute them. After actions are executed, we use Reward-Weighted Responsibility GMM (RWR-GMM) update the model after each execution. Experimentation in 3 scenarios shows that the method drastically reduces the dimensionality of the parameter space, thus implementing both action selection and adaptation to a changing situation in an efficient way.

“…Dimensionality reduction over the DoF of robots is a common approach for grasping and hand motion [6], [7]. However, it has been less used for arm robot skills.…”

Section: Introductionmentioning

confidence: 99%

Dimensionality reduction for probabilistic movement primitives

2014 IEEE-RAS International Conference on Humanoid Robots

Neumann

Peters

et al. 2014

Self Cite

Abstract-Humans as well as humanoid robots can use a large number of degrees of freedom to solve very complex motor tasks. The high-dimensionality of these motor tasks adds difficulties to the control problem and machine learning algorithms. However, it is well known that the intrinsic dimensionality of many human movements is small in comparison to the number of employed DoFs, and hence, the movements can be represented by a small number of synergies encoding the couplings between DoFs. In this paper, we want to apply Dimensionality Reduction (DR) to a recent movement representation used in robotics, called Probabilistic Movement Primitives (ProMP). While ProMP have been shown to have many benefits, they suffer with the high-dimensionality of a robotic system as the number of parameters of a ProMP scales quadratically with the dimensionality. We use probablistic dimensionality reduction techniques based on expectation maximization to extract the unknown synergies from a given set of demonstrations. The ProMP representation is now estimated in the low-dimensional space of the synergies. We show that our dimensionality reduction is more efficient both for encoding a trajectory from data and for applying Reinforcement Learning with Relative Entropy Policy Search (REPS).