2021
DOI: 10.1016/j.neucom.2020.12.116

Demonstration actor critic

citations
Cited by 5 publications
(1 citation statement)
references
References 4 publications
“…Value-based algorithms learn the optimal value function and then use it to derive the optimal policy, whereas policy-based algorithms learn the optimal policy directly. Actor-critic methods [69,94] are hybrid approaches that use policy-based methods to improve a policy while also evaluating it by estimating its corresponding value function. Several studies, including [22,95], investigated the adaptability of value-based algorithms to environmental changes.…”
Section: Related Work (citation type: mentioning)
confidence: 99%