2020
DOI: 10.1631/fitee.1900533

Deep reinforcement learning: a survey

Abstract: Deep reinforcement learning (RL) has become one of the most popular topics in artificial intelligence research. It has been widely used in various fields, such as end-to-end control, robotic control, recommendation systems, and natural language dialogue systems. In this survey, we systematically categorize the deep RL algorithms and applications, and provide a detailed review over existing deep RL algorithms by dividing them into model-based methods, model-free methods, and advanced RL methods. We thoroughly an…

Cited by 164 publications (64 citation statements)
References 42 publications
“…While it has been shown that one can learn a control policy end-to-end using deep reinforcement learning (DRL) given high-dimensional observations [31], a significant, sometimes prohibitive amount of data is needed. However, it is possible to take advantage of compact, low-dimensional state representation to improve data efficiency [32].…”
Section: Related Work (mentioning)
confidence: 99%
“…Bilevel optimization method is usually a good choice when faced with a comprehensive solution of multilevel or multi-party interests [27], [28]. Traditional back propagation particle swarm optimization (BPPSO) and reinforcement learning algorithm with action-reward incentive method [29] are also widely used optimization method to realize the process. However, there are few researches in this field at home and abroad, which are mainly faced with two difficulties: computational complexity and feedback accuracy.…”
Section: Figure 1 RPS and TGC Mechanisms in China (mentioning)
confidence: 99%
“…One method for solving this problem is to effectively combine deep learning with reinforcement learning. A deep neural network is used, in traditional reinforcement learning, to model solutions to continuous reinforcement learning tasks [27,28]. Based on this method, Lillicrap et al [29] proposed a depth deterministic strategy gradient algorithm based on the actor critic framework.…”
Section: Introduction (mentioning)
confidence: 99%