2020
DOI: 10.1002/acs.3115
Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

Abstract: Conventional closed-form solutions to the optimal control problem, derived from optimal control theory, are available only under the assumption that the system dynamics are known and described by differential equations. Without such models, reinforcement learning (RL) has been successfully applied as a candidate technique to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume either the use of a…
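As background for the iterative viewpoint the abstract describes, the sketch below shows policy iteration for a discrete-time linear-quadratic regulator. This is a hedged, model-based illustration of the general idea (alternating policy evaluation and policy improvement), not the paper's model-free integral tracking algorithm; the system matrices and the initial stabilizing gain are assumptions chosen for the example.

```python
import numpy as np

def policy_iteration_lqr(A, B, Q, R, K0, iters=50):
    """Policy iteration for the discrete-time LQR problem.

    Alternates policy evaluation (solving a discrete Lyapunov equation
    for the cost matrix P under the current gain K) with policy
    improvement (a greedy update of K). Converges to the optimal gain
    when the initial gain K0 is stabilizing.
    """
    n = A.shape[0]
    K = K0
    for _ in range(iters):
        Ac = A - B @ K                      # closed-loop dynamics
        Qc = Q + K.T @ R @ K                # closed-loop stage cost
        # Policy evaluation: solve P = Qc + Ac' P Ac by vectorizing
        # the Lyapunov equation with a Kronecker product.
        P = np.linalg.solve(np.eye(n * n) - np.kron(Ac.T, Ac.T),
                            Qc.reshape(-1)).reshape(n, n)
        # Policy improvement: K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Example (hypothetical system): a discretized double integrator
# with a hand-picked stabilizing initial gain.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K_opt, P_opt = policy_iteration_lqr(A, B, Q, R, np.array([[1.0, 1.0]]))
```

The paper's setting replaces the policy-evaluation step, which here requires the model matrices A and B, with data-driven estimates collected online, which is what makes RL applicable to unknown or varying systems.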

Cited by 8 publications (3 citation statements)
References 37 publications
“…Consequently, the solution to the optimization problem (11) is equivalent to the solution to the min-max optimization problem…”
Section: H∞ Regulation Control With Guaranteed Convergence Rate
mentioning confidence: 99%
“…Approximate optimal solutions are obtained for partially unknown nonlinear system dynamics and perturbations, based on successive approximate solutions of the HJI equation. 18 An integral reinforcement learning algorithm is proposed 19,20 to solve online for the Nash equilibrium point of the ZS differential game via policy iteration when the dynamics of the systems are completely unknown.…”
Section: Introduction
mentioning confidence: 99%
“…13 That is, current RL practice has mainly been used to achieve pre-specified goals in structured environments by handcrafting a cost or reward function whose minimization guarantees reaching the goal. 14-19 Strong AI, on the other hand, holds the promise of designing agents that can learn to achieve goals across multiple circumstances by generalizing to unforeseen and novel situations. As the designer cannot foresee all the circumstances that the agent might encounter, pre-specifying and handcrafting the reward function cannot guarantee reaching goals in an uncertain and non-stationary environment.…”
Section: Introduction
mentioning confidence: 99%