Semiparametrical Gaussian Processes Learning of Forward Dynamical Models for Navigating in a Circular Maze

Romeres, Diego; Jha, Devesh K.; Libera, Alberto Dalla; Yerazunis, Bill; Nikovski, Daniel

doi:10.1109/icra.2019.8794229

Cited by 23 publications

(25 citation statements)

References 40 publications


“…When prior knowledge about the system dynamics is available, for example given by physics first principles, the so-called physically inspired (PI) kernel can be derived. The PI kernel is a linear kernel defined on suitable basis functions $\phi(x)$, see for instance [6]. More precisely, $\phi(x) \in \mathbb{R}^{d_\phi}$ is a, possibly nonlinear, transformation of the GP input $x$ determined by the physical model.…”

Section: Squared Exponential (SE) (mentioning)

confidence: 99%

“…Then we have $k_{PI}(x_{t_j}, x_{t_k}) = \phi^T(x_{t_j}) \Sigma_{PI} \phi(x_{t_k})$, where $\Sigma_{PI}$ is a $d_\phi \times d_\phi$ positive-definite matrix, whose elements are the $k_{PI}$ hyperparameters; to limit the number of hyperparameters, a standard choice consists in considering $\Sigma_{PI}$ to be diagonal. To compensate for possible inaccuracies of the physical model, it is common to combine $k_{PI}$ with an SE kernel, obtaining so-called semi-parametric kernels [17,6], expressed as…”

Section: Squared Exponential (SE) (mentioning)

confidence: 99%
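The two excerpts above describe the PI kernel and its combination with an SE kernel. The quoted statement is cut off before the combined expression, so the sketch below assumes the common additive form $k_{SP} = k_{PI} + k_{SE}$. The basis functions in `phi`, the function names, and all hyperparameter values are hypothetical and chosen only for illustration; they are not taken from the cited papers.

```python
import numpy as np

def phi(x):
    """Hypothetical basis functions for a pendulum-like input x = [angle, velocity]."""
    x = np.atleast_2d(x)
    return np.column_stack([np.sin(x[:, 0]), x[:, 1]])

def k_pi(xa, xb, sigma_diag):
    """PI kernel phi(xa)^T Sigma_PI phi(xb), with Sigma_PI assumed diagonal."""
    return phi(xa) @ np.diag(sigma_diag) @ phi(xb).T

def k_se(xa, xb, lengthscales, variance):
    """Squared exponential kernel with one lengthscale per input dimension."""
    xa = np.atleast_2d(xa) / lengthscales
    xb = np.atleast_2d(xb) / lengthscales
    sq = np.sum(xa**2, 1)[:, None] + np.sum(xb**2, 1)[None, :] - 2 * xa @ xb.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0))

def k_sp(xa, xb, sigma_diag, lengthscales, variance):
    """Semi-parametric kernel, assumed here to be the sum of the PI and SE kernels."""
    return k_pi(xa, xb, sigma_diag) + k_se(xa, xb, lengthscales, variance)

# Gram matrix on three illustrative inputs
X = np.array([[0.1, 0.0], [0.5, -0.2], [1.0, 0.3]])
K = k_sp(X, X,
         sigma_diag=np.array([2.0, 0.5]),
         lengthscales=np.array([1.0, 1.0]),
         variance=1.0)
```

Keeping $\Sigma_{PI}$ diagonal, as the excerpt suggests, means the PI part adds only one hyperparameter per basis function.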

“…model-free RL algorithms. In particular, remarkable results have been obtained relying on Gaussian Processes (GPs) [2] to model the system dynamics, see for instance [3,4,5,6,7]. In this paper, we consider the application of MBRL algorithms to PMS, i.e., systems where only a subset of the state components can be directly measured, and the remaining components can be estimated through a proper state observer.…”

Section: Introduction (mentioning)

confidence: 99%


Model-based Policy Search for Partially Measurable Systems

Amadio, Libera, Carli et al. 2021. Preprint. Self cite.

In this paper, we propose a Model-Based Reinforcement Learning (MBRL) algorithm for Partially Measurable Systems (PMS), i.e., systems where the state cannot be directly measured but must be estimated through proper state observers. The proposed algorithm, named Monte Carlo Probabilistic Inference for Learning COntrol for Partially Measurable Systems (MC-PILCO4PMS), relies on Gaussian Processes (GPs) to model the system dynamics, and on a Monte Carlo approach to update the policy parameters. Compared with previous GP-based MBRL algorithms, MC-PILCO4PMS explicitly models the presence of state observers during policy optimization, which allows it to deal with PMS. The effectiveness of the proposed algorithm has been tested both in simulation and on two real systems.


“…However, physical parameters of the ODE model and hyperparameters of GPR are learned separately, which may yield a suboptimal model. In [9], [10], instead of discrete-time, continuous-time state transition dynamics are learned. The GPR is used to learn the mapping from positional state and action to acceleration.…”

Section: Introduction and Related Work (mentioning)

confidence: 99%

Learning Dynamic Systems Using Gaussian Process Regression with Analytic Ordinary Differential Equations as Prior Information

Tang, Fujimoto, Maruta 2021. IEICE Trans. Inf. & Syst.

Recently, the data-driven learning of dynamic systems has become a promising approach because no physical knowledge is needed. Pure machine learning approaches such as Gaussian process regression (GPR) learn a dynamic model from data, with all physical knowledge about the system discarded. This goes from one extreme, namely methods based on optimizing parametric physical models derived from physical laws, to the other. GPR has high flexibility and is able to model any dynamics as long as they are locally smooth, but cannot generalize well to unexplored areas with little or no training data. The analytic physical model derived under assumptions is an abstract approximation of the true system, but has global generalization ability. Hence the optimal learning strategy is to combine GPR with the analytic physical model. This paper proposes a method to learn dynamic systems using GPR with analytic ordinary differential equations (ODEs) as prior information. The one-time-step integration of the analytic ODEs is used as the mean function of the Gaussian process prior. The parameters to be trained include the physical parameters of the analytic ODEs and the parameters of the GPR. A novel method is proposed to learn all parameters simultaneously; it is realized by fully Bayesian GPR and is more promising for learning an optimal model. Standard Gaussian process regression, the ODE method, and an existing method from the literature are chosen as baselines to verify the benefit of the proposed method. The predictive performance is evaluated by both one-time-step prediction and long-term prediction. Simulation of the cart-pole system demonstrates that the proposed method has better predictive performance.
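The idea of using one integration step of an analytic ODE as the GP prior mean can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the abstract: the damped-pendulum ODE, the explicit-Euler step, the fixed SE hyperparameters, and all function names are hypothetical, and the physical parameters are held fixed here instead of being learned jointly by fully Bayesian GPR.

```python
import numpy as np

def ode_step(x, dt=0.05, g_over_l=9.81, damping=0.1):
    """One explicit-Euler step of a hypothetical damped pendulum, x = [theta, omega]."""
    theta, omega = x[..., 0], x[..., 1]
    domega = -g_over_l * np.sin(theta) - damping * omega
    return np.stack([theta + dt * omega, omega + dt * domega], axis=-1)

def se_kernel(a, b, ell=1.0, var=1.0):
    """Squared exponential kernel between the rows of a and b."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return var * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, Y, Xs, noise=1e-4):
    """GP posterior mean with prior mean m(x) = ode_step(x)."""
    K = se_kernel(X, X) + noise * np.eye(len(X))
    residual = Y - ode_step(X)            # the GP only models what the ODE misses
    alpha = np.linalg.solve(K, residual)
    return ode_step(Xs) + se_kernel(Xs, X) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
# Synthetic "true" system: same ODE with a different damping (model mismatch)
Y = ode_step(X, damping=0.4)

near = rng.uniform(-1.0, 1.0, size=(5, 2))   # inside the training region
far = np.array([[50.0, 50.0]])               # far from all training data
pred_near = gp_predict(X, Y, near)
pred_far = gp_predict(X, Y, far)
```

Far from the training data the kernel term vanishes and the prediction falls back to the ODE step, which is the global generalization behaviour the abstract attributes to the analytic model.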


“…However, the performance of the proposed method gradually degrades over time as the thermal model deviates from the actual system. Recently, Bayesian estimation based techniques have also been introduced to the system identification problem [18]-[22]. In particular, prior information is introduced to the identification process by designing a covariance, which is also known as a kernel in the machine learning literature.…”

Section: Introduction (mentioning)

confidence: 99%

Online Thermal Effect Modeling and Prediction of Implantable Devices

Chai, Zhang 2021. IEEE Sensors J.

The overheating caused by the operation of implantable devices can damage the surrounding tissue. In applications like neural prosthesis, … °C of temperature increase could lead to irreversible damage to the subject. Predicting the overheating effect is therefore critical to maintaining safe operation. This work proposes a Bayesian recursive multi-step prediction method for implantable devices to predict the overheating effect. The method proposed in this article achieves accurate prediction within a horizon with low complexity by model updating that iteratively minimizes a function of the j-step-ahead prediction error. At each time instant, the newly available input-output data are stored in a First In First Out (FIFO) queue of fixed length, and the model parameters are updated by iteratively minimizing the j-step-ahead prediction error of the new data. Moreover, regularization methods are introduced to improve the prediction performance by taking the Bayesian interpretation of the parameters into consideration. Monte Carlo simulation studies indicate that the developed method is able to estimate the fundamental dynamics of the system when the prediction model is underparameterized, and is robust to measurement noise. For time-varying systems, the developed method can capture the system dynamics during the system variation. The proposed method is demonstrated via an in-vitro test vehicle, which shows that the temperature increase can be predicted with high accuracy and low complexity.
