Abstract: In the next few years, the number and variety of context-aware robotic manipulator applications are expected to increase significantly, especially in household environments. In such spaces, thanks to programming by demonstration, non-expert people will be able to teach robots how to perform specific tasks, for which adaptation to the environment is imperative for the sake of effectiveness and users' safety. These robot motion learning procedures allow the encoding of such tasks by means of parameterized trajectories…
“…Exploiting the model means we will use the standard UCB method and set κ = 2, meaning we will add two standard deviations to the mean in order to find the next sampling point. This value of κ proved to be a good tradeoff in our previous works [21]. Exploring the model results in a new κ that modulates the relation between the effects of exploration (finding the sample that provides the most information to the IRL model) and exploitation (finding the best-performing sample).…”
Section: A. Bayesian Optimization for Improving the Model
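To make the quoted rule concrete, here is a minimal sketch of the UCB acquisition step, assuming a scikit-learn Gaussian process surrogate over a one-dimensional policy parameter; the data and variable names are illustrative, not taken from the cited work.

```python
# Minimal UCB sketch, assuming a scikit-learn GP surrogate.
# Data and names are illustrative, not from the cited paper.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy reward observations over a 1-D policy parameter.
X = np.array([[0.1], [0.4], [0.7]])
y = np.array([0.2, 0.8, 0.5])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X, y)

def ucb(candidates, kappa=2.0):
    # UCB adds kappa posterior standard deviations to the posterior mean;
    # kappa = 2 is the exploitation setting quoted above.
    mu, sigma = gp.predict(candidates, return_std=True)
    return mu + kappa * sigma

candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
x_next = candidates[np.argmax(ucb(candidates))]  # next sampling point
```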
Over the last decade, the ability to teach actions to robots in a user-friendly way has gained relevance, and a practical way of teaching robots a new task is to use Inverse Reinforcement Learning (IRL). In IRL, an expert teacher shows the robot a desired behaviour and an agent builds a model of the reward. The agent can also infer a policy that performs optimally within the limitations of the knowledge provided to it. However, most IRL approaches assume an (almost) optimal performance of the teaching agent, which can become impractical if the teacher is not actually an expert. In addition, most IRL approaches focus on discrete state-action spaces, which limits their applicability to certain real-world problems, such as those within the context of direct Policy Search (PS) reinforcement learning. Therefore, in this paper we introduce Ordinal Inverse Reinforcement Learning (OrdIRL) for continuous state variables, in which the teacher can qualitatively evaluate robot performance by selecting one among predefined performance levels (e.g. {bad, medium, good} for three tiers of performance). Once the OrdIRL has fitted an ordinal distribution to the data, we propose to use Bayesian Optimization (BO) either to gain knowledge on the inferred model (exploration) or to find a policy or action that maximizes the expected reward given the prior knowledge on the reward (exploitation). In the case of large-dimensional state-action spaces, we use Dimensionality Reduction (DR) techniques and perform the BO in the latent space. Experimental results in simulation and with a robot arm show how this approach allows learning the reward function from small amounts of data.
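As a hedged illustration of the pipeline this abstract describes: the paper fits a proper ordinal distribution to the teacher's labels, but a rough approximation that maps the tiers to integer scores, reduces dimensionality, and fits a GP surrogate for BO conveys the idea. Everything below (data, helper names, PCA as the DR step) is an illustrative assumption, not the authors' method.

```python
# Rough sketch only: tiers -> integer scores -> GP surrogate in a latent space.
# The actual OrdIRL fits an ordinal distribution; PCA stands in for the DR step.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor

LEVELS = {"bad": 0, "medium": 1, "good": 2}        # predefined performance tiers

rng = np.random.default_rng(0)
policies = rng.random((12, 20))                    # 12 rollouts, 20 policy params
labels = ["bad", "bad", "medium", "good", "medium", "good",
          "bad", "medium", "good", "good", "medium", "bad"]
scores = np.array([LEVELS[l] for l in labels], dtype=float)

latent = PCA(n_components=2).fit_transform(policies)   # BO runs in this 2-D space
surrogate = GaussianProcessRegressor().fit(latent, scores)
mu, sigma = surrogate.predict(latent, return_std=True)  # feeds the UCB rule above
```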
“…Nonlinear methods, such as Gaussian Process Latent Variable Models (GPLVM) [6], have also been applied for this purpose. In [7], GPLVM was employed to project task-specific motor skills of the robot onto a much smaller state representation, whereas in [8] GPLVM was also used to represent a robot manipulation policy in a latent space, taking contextual features into account. However, these approaches focus the dimensionality reduction on the robot's action characterization rather than on the manipulated object's dynamics.…”
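For readers unfamiliar with GPLVM, the following sketch shows the projection step described above, assuming the GPy library; the data shapes and the 3-D latent size are illustrative choices, not those of the cited works.

```python
# GPLVM projection sketch, assuming the GPy library.
import numpy as np
import GPy

Y = np.random.randn(100, 60)              # 100 frames of a 60-D state (illustrative)
gplvm = GPy.models.GPLVM(Y, input_dim=3)  # learn a 3-D latent space
gplvm.optimize(messages=False)            # maximize the GPLVM marginal likelihood
Z = gplvm.X                               # latent coordinates, shape (100, 3)
```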
In recent years, robotic cloth manipulation has gained relevance within the research community. While significant advances have been made in the robotic manipulation of rigid objects, the manipulation of non-rigid objects such as cloth garments is still a challenging problem. The uncertainty in how cloth behaves often requires the use of model-based approaches. However, cloth models have a very high dimensionality; therefore, it is difficult to find a middle ground between providing a manipulator with a dynamics model of cloth and working with a state space of tractable dimensionality. For this reason, most cloth manipulation approaches in the literature perform static or quasi-static manipulation. In this paper, we propose a variation of Gaussian Process Dynamical Models (GPDMs) to model cloth dynamics in a low-dimensional manifold. GPDMs project a high-dimensional state space into a lower-dimensional latent space that preserves the dynamic properties. Using this approach, we add control variables to the original formulation, so that the robot commands exerted on the cloth can be taken into account in the dynamics. We call this new version the Controlled Gaussian Process Dynamical Model (C-GPDM). Moreover, we propose an alternative kernel representation for the model, characterized by a richer parameterization than the one employed in the majority of previous GPDM realizations. The modeling capacity of our proposal has been tested in a simulated scenario, where the C-GPDM proved capable of generalizing over a considerably wide range of movements and correctly predicting the cloth oscillations generated by previously unseen sequences of control actions.
In recent years, significant advances have been made in robotic manipulation, but the handling of non-rigid objects, such as cloth garments, remains an open problem. Physical interaction with non-rigid objects is uncertain and complex to model; thus, extracting useful information from sample data can considerably improve modeling performance. However, training such models is challenging due to the high dimensionality of the state representation. In this paper, we propose Controlled Gaussian Process Dynamical Models (CGPDMs) for learning high-dimensional, nonlinear dynamics by embedding them in a low-dimensional manifold. A CGPDM consists of a low-dimensional latent space with associated dynamics on which external control variables can act, together with a mapping to the observation space. The parameters of both maps are marginalized out by placing Gaussian Process priors on them. Hence, a CGPDM projects a high-dimensional state space into a lower-dimensional latent space, in which it is feasible to learn the system dynamics from training data. The modeling capacity of the CGPDM has been tested in both a simulated and a real scenario, where it proved capable of generalizing over a wide range of movements and confidently predicting the cloth motions produced by previously unseen sequences of control actions.
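The structure these two abstracts describe (controlled latent dynamics plus an observation map) can be sketched as follows. The real C-GPDM marginalizes both mappings under GP priors and learns the latent trajectory jointly; the simplified sketch below replaces that with two fitted GP regressions and a random latent trajectory, purely to show the data flow. All shapes and names are illustrative assumptions.

```python
# Simplified C-GPDM sketch: two GP maps stand in for the marginalized mappings.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Training sequences: observations y_t (high-dim cloth state), controls u_t.
Y = np.random.randn(200, 90)   # e.g. 30 tracked cloth points x 3 coordinates
U = np.random.randn(200, 6)    # robot end-effector commands
X = np.random.randn(200, 3)    # latent trajectory (learned jointly in the real
                               # model; random here purely for illustration)

# Latent dynamics with control: x_{t+1} = f(x_t, u_t) + noise.
dyn_in = np.hstack([X[:-1], U[:-1]])
f = GaussianProcessRegressor().fit(dyn_in, X[1:])

# Observation map: y_t = g(x_t) + noise.
g = GaussianProcessRegressor().fit(X, Y)

def rollout(x0, controls):
    # Propagate the latent state under a control sequence, then decode
    # each latent state back to the high-dimensional cloth observation.
    x, traj = x0, []
    for u in controls:
        x = f.predict(np.hstack([x, u]).reshape(1, -1))[0]
        traj.append(g.predict(x.reshape(1, -1))[0])
    return np.array(traj)
```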