2019 IEEE 58th Conference on Decision and Control (CDC), 2019
DOI: 10.1109/cdc40024.2019.9029197

Connections Between Adaptive Control and Optimization in Machine Learning

Abstract: This paper demonstrates many immediate connections between adaptive control and optimization methods commonly employed in machine learning. Starting from common output error formulations, similarities in update law modifications are examined. Concepts in stability, performance, and learning common to both fields are then discussed. Building on the similarities in update laws and common concepts, new intersections and opportunities for improved algorithm analysis are provided. In particular, a specific problem…
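
As a hedged illustration of the update-law similarity the abstract alludes to (generic notation, not the paper's own; the output error e_y, regressor φ, and gain γ are placeholders), a first-order gradient step on a squared output error and a classical gradient-based adaptive law take nearly the same form:

```latex
% Machine-learning view: discrete gradient descent on a squared output error
\theta_{k+1} = \theta_k - \gamma \,\nabla_\theta \tfrac{1}{2} e_y^2(\theta_k)

% Adaptive-control view: continuous-time gradient adaptive law with regressor \phi
\dot{\theta}(t) = -\gamma \, e_y(t)\, \phi(t)
```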

Cited by 32 publications (42 citation statements)
References 60 publications (93 reference statements)

“…where τ ≪ 1 is a small update rate. This soft update law shares a similar concept to low-frequency learning in model reference adaptive control to improve the robustness of the adaptive process [23,24]. The two soft-updated target networks are then used in calculating the TD error as
$\delta_t = r_t + \gamma\, Q_{w'}\big(s_{t+1}, A_{\mu'}(s_{t+1})\big) - Q_w(s_t, a_t)$ (18)
With a very small update rate, the stability of critic network training greatly improves at the expense of a slower training process.…”
Section: A. Deep Deterministic Policy Gradient
confidence: 99%
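
A minimal sketch of the soft target-network update and TD-error computation described in the quoted passage, assuming generic critic/actor callables; `tau`, `gamma`, and the function names are illustrative placeholders, not the cited work's implementation:

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak-style soft update: target <- tau * online + (1 - tau) * target."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

def td_error(r_t, s_t, a_t, s_next, q, q_target, mu_target, gamma=0.99):
    """TD error: delta_t = r_t + gamma * Q'(s_{t+1}, mu'(s_{t+1})) - Q(s_t, a_t)."""
    a_next = mu_target(s_next)                      # target actor picks the next action
    return r_t + gamma * q_target(s_next, a_next) - q(s_t, a_t)
```

A small `tau` keeps the target networks moving slowly, which is the stabilizing (but slower) behaviour the passage describes.
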
“…The relative kinematics, shown in Eqs. (22)–(27), constitutes the environment, which is fully characterized by the engagement state…”
Section: B. Reinforcement Learning Problem Formulation
confidence: 99%
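
A loose, hedged sketch only: the engagement-state components and the kinematic equations (22)–(27) of the citing paper are not reproduced here, so the state layout and dynamics below are placeholders illustrating the idea that the relative kinematics constitutes the RL environment:

```python
import numpy as np

class EngagementEnv:
    """Toy RL environment whose state is a generic 'engagement state' (placeholder dynamics)."""

    def __init__(self, dt=0.01):
        self.dt = dt
        # Hypothetical engagement state: [relative range, range rate, LOS angle, LOS rate]
        self.state = np.array([1.0, 0.0, 0.0, 0.0])

    def step(self, action):
        r, r_dot, lam, lam_dot = self.state
        # Placeholder relative kinematics, standing in for Eqs. (22)-(27) of the citing paper
        r += self.dt * r_dot
        lam += self.dt * lam_dot
        r_dot += self.dt * (-0.1 * action)
        lam_dot += self.dt * (0.1 * action)
        self.state = np.array([r, r_dot, lam, lam_dot])
        reward = -abs(r)                  # e.g. penalize remaining relative range
        done = abs(r) < 1e-3
        return self.state.copy(), reward, done
```
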
“…In control systems, the tracking error between the system output and a predefined desired output is the most commonly used optimisation signal for tuning the parameters of the system controller (Gaudio et al., 2019; Gerasimov et al., 2019; Humaidi and Hameed, 2019; Wu and Du, 2019; Zhou et al., 2020; Zhou et al., 2017). When accompanied by adaptive control (Chen and Jiao, 2010; Narendra and Annaswamy, 2005; Tao, 2003), the approach has proven particularly useful for controlling systems that are affected by model uncertainty and random noise, operate under changing environments, and have unforeseen variations in their overall structure.…”
Section: Introduction
confidence: 99%
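
A minimal sketch, assuming a classical MIT-rule-style gradient adaptation, of how the tracking error can drive controller-parameter tuning; the scalar regressor `phi` and gain `gamma` are illustrative placeholders, not details from the cited works:

```python
def adaptive_gain_step(theta, y, y_desired, phi, gamma=0.01, dt=0.001):
    """One gradient-style adaptation step driven by the tracking error.

    theta        : adjustable controller parameter
    y, y_desired : measured and desired plant outputs
    phi          : regressor (sensitivity of the output error to theta)
    """
    e = y - y_desired                 # tracking error, the optimisation signal
    theta_dot = -gamma * e * phi      # MIT-rule / gradient adaptive law
    return theta + dt * theta_dot
```
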
“…[10] A majority of these modified adaptive algorithms are applied to first-order gradient-like updates [11,12]. Recently, several high-order optimization methods have been developed within the optimization community, which can obtain a stable and fast learning process [13,14]. Differing from the above modification strategy, the optimization process of parameter estimation is directly improved, where an approximated filter of the weight is introduced to smooth and stabilize the adaptive process.…”
Section: Introduction
confidence: 99%
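
A minimal sketch of the filtered-weight idea mentioned in the quote, assuming a first-order low-pass filter on a gradient-driven estimate; the filter constant `alpha` and the update form are assumptions, not the citing paper's algorithm:

```python
def filtered_adaptive_step(theta_raw, theta_filt, e, phi,
                           gamma=0.05, alpha=0.1, dt=0.001):
    """Gradient-like update of a raw estimate plus a smoothed (filtered) copy.

    theta_raw  : fast, gradient-driven parameter estimate
    theta_filt : low-pass filtered estimate used to smooth the adaptive process
    """
    theta_raw = theta_raw - dt * gamma * e * phi                     # first-order update
    theta_filt = theta_filt + dt * alpha * (theta_raw - theta_filt)  # approximated filter of the weight
    return theta_raw, theta_filt
```
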
“…In adaptive control, the core algorithm is often inspired by gradient descent optimization algorithm [10]. A majority of these modified adaptive algorithms are applied to first-order gradient-like updates [11,12]. Recently, several high-order optimization methods have been developed within the optimization community, which can obtain a stable and fast learning process [13,14].…”
Section: Introduction
confidence: 99%
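
As a hedged contrast between the first-order gradient-like updates and the higher-order methods the quote refers to, a heavy-ball (momentum) step is shown next to a plain gradient step; the specific higher-order algorithms cited in the originating paper may differ:

```python
def first_order_step(theta, grad, gamma=0.05):
    """Plain first-order gradient-like update."""
    return theta - gamma * grad(theta)

def heavy_ball_step(theta, theta_prev, grad, gamma=0.05, beta=0.9):
    """Higher-order (momentum) update: reuses the previous step direction."""
    theta_next = theta - gamma * grad(theta) + beta * (theta - theta_prev)
    return theta_next, theta   # returns (new theta, previous theta for the next call)
```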