2020
DOI: 10.1109/tsmc.2019.2957000
Reinforcement Q-Learning Algorithm for H∞ Tracking Control of Unknown Discrete-Time Linear Systems

Cited by 47 publications (60 citation statements)
References: 46 publications
“…Then, from (22), (19), and (24), it can be concluded that both matrix H and matrix N satisfy (15). However, it has been shown in [30] that there is a unique matrix H for (15); thus a contradiction is generated.…”
Section: B. Output Feedback Control Design (mentioning)
confidence: 97%
“…Proof: We first show that there is a unique matrix H̄ satisfying (22), and that the optimal control policies and the worst-case disturbances obtained by (23) based on the matrix H̄ are also unique. If there are two different matrices H̄ and N̄ such that (22) holds, then we can obtain matrix H and the following matrix N according to (19).…”
Section: B. Output Feedback Control Design (mentioning)
confidence: 99%
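Both excerpts above turn on the uniqueness of the Q-function kernel matrix H. As context, a minimal sketch of the standard quadratic Q-function form used in discrete-time H∞ (zero-sum) Q-learning is given below; the partitioning, the weights Q_s, R, γ, and the saddle-point policy formulas are the generic textbook versions, not necessarily the exact equations (15), (22), (23) of the cited work.

```latex
% Generic quadratic Q-function for the discrete-time zero-sum (H-infinity) setting.
% z_k stacks state, control, and disturbance; H is the symmetric kernel matrix.
\begin{aligned}
Q(x_k,u_k,w_k) &= z_k^{\top} H z_k,\qquad
z_k=\begin{bmatrix}x_k\\ u_k\\ w_k\end{bmatrix},\qquad
H=\begin{bmatrix}H_{xx}&H_{xu}&H_{xw}\\ H_{ux}&H_{uu}&H_{uw}\\ H_{wx}&H_{wu}&H_{ww}\end{bmatrix},\\[2pt]
% Q-function Bellman equation; z_{k+1}^{*} uses the saddle-point policies below:
z_k^{\top} H z_k &= x_k^{\top} Q_s x_k + u_k^{\top} R u_k - \gamma^{2} w_k^{\top} w_k
                  + z_{k+1}^{*\top} H z_{k+1}^{*},\\[2pt]
% Policies read off the blocks of H (stationarity of Q in u and w):
u^{*}(x_k) &= -\bigl(H_{uu}-H_{uw}H_{ww}^{-1}H_{wu}\bigr)^{-1}
               \bigl(H_{ux}-H_{uw}H_{ww}^{-1}H_{wx}\bigr)x_k,\\
w^{*}(x_k) &= -\bigl(H_{ww}-H_{wu}H_{uu}^{-1}H_{uw}\bigr)^{-1}
               \bigl(H_{wx}-H_{wu}H_{uu}^{-1}H_{ux}\bigr)x_k.
\end{aligned}
```

Uniqueness of H is what makes this construction well posed: the control and disturbance gains are read directly from the blocks of H, so two distinct kernels satisfying the same Bellman equation would produce two distinct saddle-point policy pairs, which is the contradiction the excerpts exploit.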
“…[T]his property can result in difficulties for implementing the Q-learning technique for the continuous-time system. Additionally, compared with the data-driven method in [48], this work provides a neural-network-based technique to avoid the Kronecker product in estimating the actor/critic term. The actor/critic-based approaches have been discussed in [49], [50] for nonlinear affine systems using the residual error δ_hjb.…”
Section: ARL-Based Control Design for Independent Joints (mentioning)
confidence: 99%
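The Kronecker-product issue mentioned here refers to parameterizing a quadratic value/Q term through the regressor z ⊗ z, whose dimension grows quickly with the state size. A minimal, hypothetical sketch of the alternative idea, adapting critic weights by gradient descent on a Bellman/HJB-style residual δ_hjb over a chosen feature map, is shown below; the feature map phi, the update law, and the toy system are illustrative assumptions, not the cited papers' designs.

```python
# Minimal sketch (assumptions, not the cited papers' exact laws): approximate the
# value/Q term with a chosen feature map phi(x) and adapt the critic weights by
# gradient descent on a Bellman/HJB-style residual delta_hjb, instead of forming
# a Kronecker-product regressor over the full quadratic basis.
import numpy as np

def phi(x):
    """Hypothetical critic feature map (here: quadratic monomials of a 2-D state)."""
    x1, x2 = x
    return np.array([x1 * x1, x1 * x2, x2 * x2])

def critic_update(w, x, x_next, reward, lr=0.05, gamma=0.95):
    """One residual-based critic step: delta_hjb = phi(x)^T w - (reward + gamma * phi(x')^T w)."""
    delta_hjb = phi(x) @ w - (reward + gamma * (phi(x_next) @ w))
    # Semi-gradient step that shrinks the residual along phi(x).
    return w - lr * delta_hjb * phi(x)

# Toy usage on a stable linear system x_{k+1} = A x_k + B u_k under a fixed policy.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
w = np.zeros(3)
x = np.array([1.0, -1.0])
for _ in range(200):
    u = np.array([-0.5 * x[1]])                     # fixed evaluation policy
    x_next = A @ x + B @ u
    reward = -(x @ x + 0.1 * float(u @ u))          # negative quadratic cost as reward
    w = critic_update(w, x, x_next, reward)
    x = x_next if np.linalg.norm(x_next) > 1e-3 else np.array([1.0, -1.0])
print("critic weights:", w)
```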
“…where the coefficients in (54) are mentioned in (55), (56), and (35). Let us consider the control structure (Figure 2), (10) with the ARL-based control scheme (46), (10), (39), and (30), the associated adjusting mechanisms (35) and (33) for the actual controller, and the constraint force control vector (48); then, (1) the actor-critic weight errors W_a and W_c are UUB; (2) the tracking effectiveness of not only 1) is also UUB; (3) the tracking of the constraint force coefficient vector λ and the remaining terms of the joint variables' vector η_{2n} = [ξ_{(2n−m)}, p_m] is also UUB.…”
Section: Convergence and Stability Analysis (mentioning)
confidence: 99%
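UUB is used here in its standard sense of uniform ultimate boundedness; the following is the textbook definition rather than anything specific to the cited analysis.

```latex
% Standard definition: e(t) is uniformly ultimately bounded (UUB) if its
% trajectories enter, and thereafter remain in, a ball of radius b after a
% finite time T that may depend on the size of the initial condition.
\|e(t_0)\| \le a \;\Longrightarrow\; \exists\, T(a,b)\ge 0:\quad
\|e(t)\| \le b \quad \forall\, t \ge t_0 + T .
```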
“…The same method has also been used to solve the OPFB LQT problem [24] by employing the VFA technique. In recent studies, the model-free state reconstruction technique was also used to develop an OPFB Q-learning PI scheme for the H∞ control problem [33], [34].…”
Section: Introduction (mentioning)
confidence: 99%
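For orientation, the class of algorithms being referenced, model-free Q-learning policy iteration (PI) for a discrete-time zero-sum / H∞-type problem, can be sketched as below. This is a generic state-feedback illustration: the system matrices, weights, exploration noise, and the least-squares estimation of the kernel H are assumptions made for the example, and the cited schemes additionally use output feedback with state reconstruction and a tracking formulation.

```python
# Generic sketch of model-free Q-learning policy iteration for a discrete-time
# zero-sum (H-infinity-type) problem with full state measurement. Each iteration
# fits the Q-function kernel H from data by least squares, then improves the
# control and disturbance policies from the blocks of H.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [0.0, 0.8]])      # used only to generate data; unknown to the learner
B = np.array([[0.0], [0.5]])
E = np.array([[0.1], [0.1]])
Qs, R, gamma2 = np.eye(2), np.eye(1), 4.0   # stage weights and gamma^2

n, m, q = 2, 1, 1
K = np.zeros((m, n))                         # control policy u = -K x
L = np.zeros((q, n))                         # disturbance policy w = L x

for it in range(10):
    # --- policy evaluation: fit vec(H) from the Q-function Bellman equation ---
    Phi, y = [], []
    x = rng.standard_normal(n)
    for k in range(200):
        u = -K @ x + 0.5 * rng.standard_normal(m)    # exploration noise for excitation
        w = L @ x + 0.5 * rng.standard_normal(q)
        x1 = A @ x + B @ u + E @ w
        z = np.concatenate([x, u, w])
        z1 = np.concatenate([x1, -K @ x1, L @ x1])   # next step follows current policies
        r = x @ Qs @ x + u @ R @ u - gamma2 * (w @ w)
        Phi.append(np.kron(z, z) - np.kron(z1, z1))
        y.append(r)
        x = x1
    vecH, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = vecH.reshape(n + m + q, n + m + q)
    H = 0.5 * (H + H.T)                              # symmetrize the estimated kernel

    # --- policy improvement from the blocks of H ---
    Hux, Huu, Huw = H[n:n+m, :n], H[n:n+m, n:n+m], H[n:n+m, n+m:]
    Hwx, Hwu, Hww = H[n+m:, :n], H[n+m:, n:n+m], H[n+m:, n+m:]
    K = np.linalg.solve(Huu - Huw @ np.linalg.solve(Hww, Hwu),
                        Hux - Huw @ np.linalg.solve(Hww, Hwx))
    L = -np.linalg.solve(Hww - Hwu @ np.linalg.solve(Huu, Huw),
                         Hwx - Hwu @ np.linalg.solve(Huu, Hux))

print("learned control gain K:", K)
print("learned disturbance gain L:", L)
```

Each iteration evaluates the current policy pair by fitting H purely from measured data, then improves both the control and disturbance gains from the blocks of H; this is the PI structure the excerpt refers to, with the OPFB variants replacing the measured state by a reconstruction from input-output data.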