2011
DOI: 10.3182/20110828-6-it-1002.00759

Actor-Critic Control with Reference Model Learning

citations
Cited by 13 publications
(3 citation statements)
references
References 15 publications
(9 reference statements)
“…where $Q_{\upsilon}(s, a)$ is state-action value function and transition function is $\{s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1}\}$. By using TD error, the critic parameter $\upsilon$ can be updated by the gradient-descent update rule [30] which is represented in the following equation:…”
Section: Actor Part Policy Gradient (PG) Methods
Citation type: mentioning
confidence: 99%
“…In cloud computing, all incoming tasks should be allocated to suitable VMs for executing in minimum time. It depends upon the capacity of the VM that is represented in Equation (30). Load of the VM is calculated in Equation (31).…”
Section: Hpsoac Scheduling Policy For Parameter Updating
Citation type: mentioning
confidence: 99%
“…Recent studies show that RL can calculate the optimal impedance model time-varying environments (Wang et al., 2015). The optimal critical algorithm (Grondman et al., 2012a, 2012b, 2011) is applied to obtain the optimum impedance strength. The parameters of the impedance model are estimated by the exponential weighted minimum square (Astrom and Wittenmark, 1989), but it needs a parameterization model for the impedance parameters (Chih and Huang, 2004).…”
Section: Introduction
Citation type: mentioning
confidence: 99%