2019 IEEE 58th Conference on Decision and Control (CDC), 2019
DOI: 10.1109/cdc40024.2019.9029197

Connections Between Adaptive Control and Optimization in Machine Learning

Abstract: This paper demonstrates many immediate connections between adaptive control and optimization methods commonly employed in machine learning. Starting from common output error formulations, similarities in update law modifications are examined. Concepts in stability, performance, and learning common to both fields are then discussed. Building on the similarities in update laws and common concepts, new intersections and opportunities for improved algorithm analysis are provided. In particular, a specific problem…
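
As a hedged illustration of the update-law similarity the abstract alludes to (generic notation, not the paper's own; the output error e_y, regressor φ, and gain γ are placeholders), a first-order gradient step on a squared output error and a classical gradient-based adaptive law take nearly the same form:

```latex
% Machine-learning view: discrete gradient descent on a squared output error
\theta_{k+1} = \theta_k - \gamma \,\nabla_\theta \tfrac{1}{2} e_y^2(\theta_k)

% Adaptive-control view: continuous-time gradient adaptive law with regressor \phi
\dot{\theta}(t) = -\gamma \, e_y(t)\, \phi(t)
```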

Cited by 32 publications (42 citation statements)
References 60 publications (93 reference statements)

“…where τ ≪ 1 is a small update rate. This soft update law shares a similar concept to low-frequency learning in model reference adaptive control to improve the robustness of the adaptive process [23,24]. The two soft-updated target networks are then used in calculating the TD error as
$\delta_t = r_t + \gamma\, Q_{w'}\big(s_{t+1}, A_{\mu'}(s_{t+1})\big) - Q_w(s_t, a_t)$ (18)
With a very small update rate, the stability of critic network training greatly improves at the expense of a slower training process.…”
Section: A. Deep Deterministic Policy Gradient
confidence: 99%
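
A minimal sketch of the soft target-network update and TD-error computation described in the quoted passage, assuming generic critic/actor callables; `tau`, `gamma`, and the function names are illustrative placeholders, not the cited work's implementation:

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak-style soft update: target <- tau * online + (1 - tau) * target."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

def td_error(r_t, s_t, a_t, s_next, q, q_target, mu_target, gamma=0.99):
    """TD error: delta_t = r_t + gamma * Q'(s_{t+1}, mu'(s_{t+1})) - Q(s_t, a_t)."""
    a_next = mu_target(s_next)                      # target actor picks the next action
    return r_t + gamma * q_target(s_next, a_next) - q(s_t, a_t)
```

A small `tau` keeps the target networks moving slowly, which is the stabilizing (but slower) behaviour the passage describes.
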
“…The relative kinematics, shown in Eqs. (22)–(27), constitutes the environment, which is fully characterized by the engagement state…”
Section: B. Reinforcement Learning Problem Formulation
confidence: 99%
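
A loose, hedged sketch only: the engagement-state components and the kinematic equations (22)–(27) of the citing paper are not reproduced here, so the state layout and dynamics below are placeholders illustrating the idea that the relative kinematics constitutes the RL environment:

```python
import numpy as np

class EngagementEnv:
    """Toy RL environment whose state is a generic 'engagement state' (placeholder dynamics)."""

    def __init__(self, dt=0.01):
        self.dt = dt
        # Hypothetical engagement state: [relative range, range rate, LOS angle, LOS rate]
        self.state = np.array([1.0, 0.0, 0.0, 0.0])

    def step(self, action):
        r, r_dot, lam, lam_dot = self.state
        # Placeholder relative kinematics, standing in for Eqs. (22)-(27) of the citing paper
        r += self.dt * r_dot
        lam += self.dt * lam_dot
        r_dot += self.dt * (-0.1 * action)
        lam_dot += self.dt * (0.1 * action)
        self.state = np.array([r, r_dot, lam, lam_dot])
        reward = -abs(r)                  # e.g. penalize remaining relative range
        done = abs(r) < 1e-3
        return self.state.copy(), reward, done
```
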
“…In control systems, the tracking error between the system output and a predefined desired output is the most commonly used optimisation signal for tuning the parameters of the system controller (Gaudio et al., 2019; Gerasimov et al., 2019; Humaidi and Hameed, 2019; Wu and Du, 2019; Zhou et al., 2020; Zhou et al., 2017). When accompanied by adaptive control (Chen and Jiao, 2010; Narendra and Annaswamy, 2005; Tao, 2003), the approach has proven particularly useful for controlling systems that are affected by model uncertainty and random noise, operate under changing environments, and have unforeseen variations in their overall structure.…”
Section: Introduction
confidence: 99%
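
A minimal sketch, assuming a classical MIT-rule-style gradient adaptation, of how the tracking error can drive controller-parameter tuning; the scalar regressor `phi` and gain `gamma` are illustrative placeholders, not details from the cited works:

```python
def adaptive_gain_step(theta, y, y_desired, phi, gamma=0.01, dt=0.001):
    """One gradient-style adaptation step driven by the tracking error.

    theta        : adjustable controller parameter
    y, y_desired : measured and desired plant outputs
    phi          : regressor (sensitivity of the output error to theta)
    """
    e = y - y_desired                 # tracking error, the optimisation signal
    theta_dot = -gamma * e * phi      # MIT-rule / gradient adaptive law
    return theta + dt * theta_dot
```
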
“…[10] A majority of these modified adaptive algorithms are applied to first-order gradient-like updates [11,12]. Recently, several high-order optimization methods have been developed within the optimization community, which can obtain a stable and fast learning process [13,14]. Differing from the above modification strategy, the optimization process of parameter estimation is directly improved, where an approximated filter of the weight is introduced to smooth and stabilize the adaptive process.…”
Section: Introduction
confidence: 99%
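
A minimal sketch of the filtered-weight idea mentioned in the quote, assuming a first-order low-pass filter on a gradient-driven estimate; the filter constant `alpha` and the update form are assumptions, not the citing paper's algorithm:

```python
def filtered_adaptive_step(theta_raw, theta_filt, e, phi,
                           gamma=0.05, alpha=0.1, dt=0.001):
    """Gradient-like update of a raw estimate plus a smoothed (filtered) copy.

    theta_raw  : fast, gradient-driven parameter estimate
    theta_filt : low-pass filtered estimate used to smooth the adaptive process
    """
    theta_raw = theta_raw - dt * gamma * e * phi                     # first-order update
    theta_filt = theta_filt + dt * alpha * (theta_raw - theta_filt)  # approximated filter of the weight
    return theta_raw, theta_filt
```
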
“…In adaptive control, the core algorithm is often inspired by gradient descent optimization algorithm [10]. A majority of these modified adaptive algorithms are applied to first-order gradient-like updates [11,12]. Recently, several high-order optimization methods have been developed within the optimization community, which can obtain a stable and fast learning process [13,14].…”
Section: Introduction
confidence: 99%
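
As a hedged contrast between the first-order gradient-like updates and the higher-order methods the quote refers to, a heavy-ball (momentum) step is shown next to a plain gradient step; the specific higher-order algorithms cited in the originating paper may differ:

```python
def first_order_step(theta, grad, gamma=0.05):
    """Plain first-order gradient-like update."""
    return theta - gamma * grad(theta)

def heavy_ball_step(theta, theta_prev, grad, gamma=0.05, beta=0.9):
    """Higher-order (momentum) update: reuses the previous step direction."""
    theta_next = theta - gamma * grad(theta) + beta * (theta - theta_prev)
    return theta_next, theta   # returns (new theta, previous theta for the next call)
```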