Abstract: The Koopman operator has become an essential tool for data-driven approximation of dynamical (control) systems, e.g., via extended dynamic mode decomposition. Despite its popularity, convergence results and, in particular, error bounds are still scarce. In this paper, we derive probabilistic bounds for the approximation error and the prediction error depending on the number of training data points, for both ordinary and stochastic differential equations while using either ergodic trajectories or i.i.d. samples…
“…At this point, suitable candidates appear to be GRUs instead of LSTMs (see, e.g., [54]), sparse regression techniques to identify governing equations [6], [55], and in particular the highly popular Koopman operator [56]-[58], as it allows us to learn linear models of nonlinear systems, which is very efficient both in terms of the required training data and the run time. Finally, it might be worth looking into recent prediction error results for these methods [59]-[61] and see whether they can be transferred into guarantees for the RL process.…”
The goal of this paper is to make a strong case for the usage of dynamical models when using reinforcement learning (RL) for feedback control of dynamical systems governed by partial differential equations (PDEs). To bridge the gap between the immense promises we see in RL and the applicability in complex engineering systems, the main challenges are the massive requirements in terms of the training data, as well as the lack of performance guarantees. We present a solution for the first issue using a data-driven surrogate model in the form of a convolutional LSTM with actuation. We demonstrate that learning an actuated model in parallel to training the RL agent significantly reduces the total amount of required data sampled from the real system. Furthermore, we show that iteratively updating the model is of major importance to avoid biases in the RL training. Detailed ablation studies reveal the most important ingredients of the modeling process. We use the chaotic Kuramoto-Sivashinsky equation to demonstrate our findings.
“…In order to apply the proposed design scheme, we predefine the outer bound ∆ Φ on the safe operating region following the approach in [18, Procedure 7]. In particular, we first solve (15) for Qz = −I, Sz = 0, and arbitrary Rz > 0 without considering the second linear matrix inequality (16). This leads to a matrix P which relates the measured data and the chosen observables to infer a likely behavior of the closed-loop system, such that the safe operating region is maximized according to the underlying system dynamics.…”
Section: Nonlinear Inverted Pendulum (mentioning)
confidence: 99%
“…Part I: x ∈ X SOR implies Φ(x) ∈ ∆ Φ : In order to represent the lifted dynamics via (40), we require Φ(x) ∈ ∆ Φ for all times. To this end, we exploit that (16) after dividing the inequality by ν. We rewrite this inequality as…”
Section: B Proof Of Theorem 41 (mentioning)
confidence: 99%
“…To this end, [15] proved convergence of EDMD to the Koopman operator in the infinite-data limit. First finite-data error bounds, including bilinear EDMD for control systems, were proven in [16]. The error bounds in these works, however, are global in the sense that the error does not vanish at the origin, which prevents a straightforward application of standard control methods.…”
Data-driven analysis and control of dynamical systems have gained a lot of interest in recent years. While the class of linear systems is well studied, theoretical results for nonlinear systems are still rare. In this paper, we present a data-driven controller design method for discrete-time control-affine nonlinear systems. Our approach relies on the Koopman operator, which is a linear but infinite-dimensional operator lifting the nonlinear system to a higher-dimensional space. In particular, we derive a linear fractional representation of a lifted bilinear system representation based on measured data. Further, we restrict the lifting to finite dimensions, but account for the truncation error using a finite-gain argument. We derive a linear matrix inequality based design procedure to guarantee robust local stability for the resulting bilinear system for all error terms satisfying the finite-gain bound and, thus, also for the underlying nonlinear system. Finally, we apply the developed design method to the nonlinear Van der Pol oscillator.
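The lifting idea described in this abstract can be illustrated with a minimal sketch of plain (uncontrolled) EDMD. This is not the bilinear, control-affine scheme of the paper: the toy discrete-time system, its parameters `a`, `b`, `c`, the dictionary `[x1, x2, x1^2]`, and the function names `lift`, `step`, and `edmd` are all assumptions chosen for illustration, picked so that the dictionary spans a Koopman-invariant subspace and the truncation error is zero.

```python
import numpy as np

def lift(x):
    """Hypothetical dictionary of observables: [x1, x2, x1^2]."""
    x = np.atleast_2d(x)
    return np.column_stack([x[:, 0], x[:, 1], x[:, 0] ** 2])

def step(x, a=0.9, b=0.5, c=0.3):
    """Assumed toy nonlinear map: x1+ = a*x1, x2+ = b*x2 + c*x1^2."""
    x = np.atleast_2d(x)
    return np.column_stack([a * x[:, 0], b * x[:, 1] + c * x[:, 0] ** 2])

def edmd(X, Y):
    """Least-squares fit of K such that lift(Y) ~ lift(X) @ K."""
    K, *_ = np.linalg.lstsq(lift(X), lift(Y), rcond=None)
    return K

# i.i.d. samples of the state and their one-step successors
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = step(X)
K = edmd(X, Y)

# Compare a 5-step linear prediction in lifted space with the true dynamics
x = np.array([[0.7, -0.4]])
z = lift(x)
for _ in range(5):
    z = z @ K      # linear update of the observables
    x = step(x)    # nonlinear update of the state
pred_error = np.abs(z[:, :2] - x).max()
```

For a general nonlinear system no finite dictionary is exactly invariant, so `pred_error` would not vanish; that residual is what the finite-gain argument in the abstract is meant to account for.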