“…The idea motivating this choice is the following: the MP kernel allows capturing possible modes of the system that are polynomial functions in x, which are typical in mechanical systems [16], while the SE kernel models more complex behaviors not captured by the polynomial kernel.…”
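The combination described above, a polynomial (MP) kernel summed with a Squared Exponential (SE) kernel, can be sketched in a few lines, since a sum of valid kernels is itself a valid kernel. A minimal numpy illustration (function names and hyperparameters such as `lengthscale` and `degree` are illustrative choices, not taken from the paper):

```python
import numpy as np

def se_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared Exponential kernel: encodes smooth, flexible behavior."""
    sq_dist = np.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def mp_kernel(x1, x2, degree=2, variance=1.0):
    """Polynomial kernel: captures modes that are polynomial functions in x."""
    return variance * (x1 @ x2.T + 1.0) ** degree

def combined_kernel(x1, x2):
    """Sum of the two kernels; the sum of valid kernels is a valid kernel."""
    return mp_kernel(x1, x2) + se_kernel(x1, x2)
```

The polynomial term lets the GP posterior extrapolate the polynomial structure typical of mechanical systems, while the SE term absorbs residual behavior the polynomial part cannot represent.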
In this paper, we propose a Model-Based Reinforcement Learning (MBRL) algorithm for Partially Measurable Systems (PMS), i.e., systems where the state cannot be directly measured but must be estimated through proper state observers. The proposed algorithm, named Monte Carlo Probabilistic Inference for Learning COntrol for Partially Measurable Systems (MC-PILCO4PMS), relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to update the policy parameters. W.r.t. previous GP-based MBRL algorithms, MC-PILCO4PMS explicitly models the presence of state observers during policy optimization, allowing it to handle PMS. The effectiveness of the proposed algorithm has been tested both in simulation and on two real systems.
“…Thus, in this setting, it is possible to analytically compute the policy gradient from long-term predictions. However, as already mentioned in Section I, the Gaussian approximation performed in moment matching is also the cause of two main weaknesses: (i) The computation of the two moments has been performed assuming the use of SE kernels, which might lead to poor generalization properties in data that have not been seen during training [9], [10], [11], [12]. (ii) Moment matching allows modeling only unimodal distributions, which might be a too restrictive approximation of the real system behavior.…”
Section: B. GPR and One-Step-Ahead Predictions
mentioning
confidence: 99%
“…The proposed speed-integration model learns only d_x/2 GPs, each of which models the evolution of a distinct velocity component Δ_t^(i_k), with i_k ∈ I_q̇. Then, the evolution of the position change Δ_t^(i_k), with i_k ∈ I_q, is computed according to (9) and the predicted change in velocity.…”
Section: A. Model Learning
mentioning
confidence: 99%
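The speed-integration model above lets the GPs predict only the velocity changes, with the position changes obtained by integrating the velocity evolution. A minimal sketch of one prediction step, assuming equation (9) is a constant-acceleration integration over a sampling time `Ts` (a plausible reading of the snippet; variable names are hypothetical):

```python
import numpy as np

def speed_integration_step(q, q_dot, delta_q_dot, Ts):
    """One-step-ahead prediction: the GP supplies the velocity change
    delta_q_dot; the position change follows from integrating the velocity
    under a constant-acceleration assumption over the sampling interval."""
    q_dot_next = q_dot + delta_q_dot               # GP-predicted velocity update
    delta_q = Ts * q_dot + (Ts / 2.0) * delta_q_dot  # integrated position change
    q_next = q + delta_q
    return q_next, q_dot_next
```

This halves the number of GPs to train (d_x/2 instead of d_x) and enforces the kinematic consistency between positions and velocities by construction.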
“…(ii) The computation of the moments is shown to be tractable only when considering Squared Exponential (SE) kernels and differentiable cost functions. In particular, the limitation on the kernel choice might be very stringent, as GPs with the SE kernel impose smoothness properties on the posterior estimator and might show poor generalization properties on data that have not been seen during training [9], [10], [11], [12].…”
In this paper, we present a Model-Based Reinforcement Learning algorithm named Monte Carlo Probabilistic Inference for Learning COntrol (MC-PILCO). The algorithm relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to estimate the policy gradient. This defines a framework in which we ablate the choice of the following components: (i) the selection of the cost function, (ii) the optimization of policies using dropout, and (iii) improved data efficiency through the use of structured kernels in the GP models. The combination of the aforementioned aspects dramatically affects the performance of MC-PILCO. Numerical comparisons in a simulated cart-pole environment show that MC-PILCO exhibits better data efficiency and control performance than state-of-the-art GP-based MBRL algorithms. Finally, we apply MC-PILCO to real systems, considering in particular systems with partially measurable states. We discuss the importance of modeling both the measurement system and the state estimators during policy optimization. The effectiveness of the proposed solutions has been tested in simulation and on two real systems, a Furuta pendulum and a ball-and-plate.
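The Monte Carlo policy-gradient idea in the abstract, simulating particles through the learned stochastic model and differentiating the cumulative cost, can be sketched with the reparameterization trick. In this illustrative snippet, `dynamics_mean` and `dynamics_std` stand in for the GP posterior mean and standard deviation (all names are hypothetical, not the paper's API):

```python
import torch

def mc_policy_gradient(policy, dynamics_mean, dynamics_std, x0_batch, horizon, cost):
    """Monte Carlo estimate of the expected cumulative cost and its gradient:
    roll a batch of particles through the stochastic one-step model using
    reparameterized sampling, then backpropagate through the whole rollout."""
    x = x0_batch
    total_cost = 0.0
    for _ in range(horizon):
        u = policy(x)
        mean, std = dynamics_mean(x, u), dynamics_std(x, u)
        # reparameterized sample keeps the rollout differentiable w.r.t. policy params
        x = mean + std * torch.randn_like(mean)
        total_cost = total_cost + cost(x).mean()
    total_cost.backward()  # gradients accumulate in the policy parameters
    return total_cost
```

Unlike moment matching, the particle distribution at each step is free to be multimodal, at the price of a stochastic (rather than analytic) gradient estimate.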
“…with the mean value µ_θ(k + i + 1|k) and variance Σ_θ(k + i + 1|k) calculations similar to (30). The difference between models (46) and (28) is that the former depends on the actual internal state α(k + i), while the latter uses the estimated internal state α̂(k + i|k). Model (28) is actually used for θ trajectory prediction through the MPC formulation.…”
Ranging from cart-pole systems and autonomous bicycles to bipedal robots, control of these underactuated balance robots aims to achieve both external (actuated) subsystem trajectory tracking and internal (unactuated) subsystem balancing tasks with limited actuation authority. This paper proposes a learning model-based control framework for underactuated balance robots. The key idea to simultaneously achieve the tracking and balancing tasks is to design control strategies in slow- and fast-time scales, respectively. In the slow-time scale, model predictive control (MPC) is used to generate the desired internal subsystem trajectory that encodes the external subsystem tracking performance and control input. In the fast-time scale, the actual internal trajectory is stabilized to the desired internal trajectory by an inverse dynamics controller. The coupling effects between the external and internal subsystems are captured through the planned internal trajectory profile and the dual structural properties of the robotic systems. The control design is based on Gaussian process (GP) regression models that are learned from experiments without the need for a priori knowledge of the robot dynamics or a successful balance demonstration. The GPs provide estimates of the modeling uncertainties of the robotic systems, and these uncertainty estimates are incorporated into the MPC design to enhance control robustness to modeling errors. The learning-based control design is analyzed with guaranteed stability and performance. The proposed design is demonstrated by experiments on a Furuta pendulum and an autonomous bikebot.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.