Towards Generalization and Simplicity in Continuous Control

Rajeswaran, Aravind; Lowrey, Kendall; Todorov, Emanuel; Kakade, Sham M.

doi:10.48550/arxiv.1703.02660

Cited by 10 publications

(30 citation statements)

References 17 publications

“…$u_{exog}(t)$ can be determined using $u_r(t)$, $x_r(t)$, and (37) as $\phi(x)$ is a known function of $x$. It is clear from (39)-(41) that if there are no parametric uncertainties, and if the initial conditions of (35) are identical to those of (36), with $K(0) = I$, $\Theta_{nl}(0) = \Theta_{nl,r}$ and $\Theta_l(0) = \Theta_{l,r}$, then the AC-RL policy coincides with $u_r(t)$. For the rest of this paper, unless otherwise mentioned, we choose $Q = 2I$. The following theorem presents the stability property of the AC-RL as well as Regret (defined in (14)):…”

Section: AC-RL (mentioning)

confidence: 99%

“…Theorem 2. Under Assumptions A1-A2, A4', and A5, the closed-loop adaptive system specified by (35), (36), (38) and (43) has globally bounded solutions with $\lim_{t\to\infty} e(t) = 0$ with $R = O(1)$.…”

Section: HTAC-RL (mentioning)

confidence: 99%

“…Theorem 3. Under Assumptions A1-A3 and A4', the closed-loop adaptive system specified by the target system (35), the reference system (36), the magnitude constraint in (46) and the MSAC-RL controller given by (48)-(51) will (i) have globally bounded solutions, with $\lim_{t\to\infty} e_u(t) = 0$ if the target system in (35) is open-loop stable.…”

Section: MSAC-RL (mentioning)

confidence: 99%

“…As Theorems 1-3 pertain to the same plant (35), reference system (36), and control input in (8), the following derivation is common to their proofs. From Assumption A3 and Eq.…”

Section: Appendix (mentioning)

confidence: 99%

“…In practice, however, offline policies trained in simulation often exhibit degenerate performance when used for real-time control due to modeling errors that can occur online [32,33]. It may be difficult to reliably predict the behavior of a learned policy when it is applied to an environment different from the one seen during training [34][35][36][37]. The first set of approaches we propose in this paper seeks to bridge this "sim-to-real" gap by proposing various combinations of the AC and RL approaches so as to realize their individual advantages and minimize their weaknesses.…”

Section: Introduction (mentioning)

confidence: 99%

Integration of adaptive control and reinforcement learning for real-time control and learning

Annaswamy, Guha, Cui, et al. 2021

Preprint

This paper considers the problem of real-time control and learning in dynamic systems subjected to uncertainties. Adaptive approaches are proposed to address the problem, which are combined with methods and tools in Reinforcement Learning (RL) and Machine Learning (ML). Algorithms are proposed in continuous-time that combine adaptive approaches with RL leading to online control policies that guarantee stable behavior in the presence of parametric uncertainties that occur in real-time. Algorithms are proposed in discrete-time that combine adaptive approaches proposed for parameter and output estimation and ML approaches proposed for accelerated performance that guarantee stable estimation even in the presence of time-varying regressors, and for accelerated learning of the parameters with persistent excitation. Numerical validations of all algorithms are carried out using a quadrotor landing task on a moving platform and benchmark problems in ML. All results clearly point out the advantage of the proposed integrative approaches for real-time control and learning.

Using a half cheetah habitat for random augmentation computing

Kishor 2024

Multimed Tools Appl

The Effect of Training Schedules on Morphological Robustness and Generalization

Barba, Yaman, Iacca 2024

Proceedings of the Genetic and Evolutionary Computation Conference Companion

Robustness and generalizability are the key properties of artificial neural network (ANN)-based controllers for maintaining a reliable performance in case of changes. It is demonstrated that exposing the ANNs to variations during training processes can improve their robustness and generalization capabilities. However, the way in which this variation is introduced can have a significant impact. In this paper, we define various training schedules to specify how these variations are introduced during an evolutionary learning process. In particular, we focus on morphological robustness and generalizability concerned with finding an ANN-based controller that can provide sufficient performance on a range of physical variations. Then, we perform an extensive analysis of the effect of these training schedules on morphological generalization. Furthermore, we formalize the process of training sample selection (i.e., morphological variations) to improve generalization as a reinforcement learning problem. Overall, our results provide deeper insights into the role of variability and the ways of enhancing the generalization property of evolved ANN-based controllers.

CCS Concepts: Theory of computation → Evolutionary algorithms; Computing methodologies → Neural networks.
