2015
DOI: 10.48550/arxiv.1511.06581
Dueling Network Architectures for Deep Reinforcement Learning

Cited by 208 publications (285 citation statements) | References 7 publications
“…The reinforcement learning method adopted in the DRLCFA method is based on the idea of the V-D D3QN method [38]. This method is a variant of the Double Dueling Q-learning Network, which changes the update mode of the Q-network in an innovative way while retaining the dueling structure [40].…”
Section: Reinforcement Learning Methods and The Agentmentioning
confidence: 99%
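For context, the D3QN family referenced in this excerpt builds on the double Q-learning target, which decouples action selection (online network, parameters θ) from action evaluation (target network, parameters θ⁻). A minimal sketch of that target, written here for illustration rather than quoted from [38] or [40]:

y_t = r_{t+1} + \gamma \, Q\big(s_{t+1}, \arg\max_{a} Q(s_{t+1}, a; \theta_t); \ \theta^{-}_t\big)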
“…To improve the data efficiency of DQN and expedite its convergence, it is shown in [51] that, instead of uniform sampling, more surprising transitions should be sampled more frequently, a method called prioritized experience replay (PER). Rainbow-DQN, which demonstrates superior performance compared with other DQN variants in several Atari games [52], combines several of the most effective DQN improvements: double Q-learning [53], PER [51], the dueling architecture [54], multi-step learning, distributional reinforcement learning [55], and noisy nets [56]. In this paper, we use a double DQN with dueling architecture and PER.…”
Section: Reinforcement Learning For Cppmentioning
confidence: 99%
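As a reminder of what PER [51] does (an illustrative summary, not a quotation from the citing paper): each transition i is sampled with probability proportional to a priority p_i, typically derived from its TD error, with an exponent α controlling how strongly prioritization is applied, and importance-sampling weights w_i correcting the bias introduced by non-uniform sampling:

P(i) = \frac{p_i^{\alpha}}{\sum_k p_k^{\alpha}}, \qquad w_i = \left( \frac{1}{N \cdot P(i)} \right)^{\beta}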
“…Therefore, the dueling DDQN can generalize the learning process across actions and can quickly identify the best actions and important states without learning the effect of each action in each state. The dueling DQN [35] is an improved version of DQN in which the Q-network has two streams (sequences), i.e., the state-action value function is decomposed into the state value function V^π(s) and the advantage function A^π(s, a), to speed up convergence and improve efficiency. The value function V^π(s) represents the quality of being in a particular state (the average contribution of that state to the Q-function), and the advantage function A^π(s, a) measures the relative importance of a particular action compared with the other actions in that state.…”
Section: Dueling Double Deep Q Network Algorithmmentioning
confidence: 99%
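For reference, the cited dueling architecture ([35] in this excerpt, i.e., the paper this report covers) recombines the two streams into Q-values with an aggregation module; the mean-subtracted form reported in the original paper, with θ the shared parameters and α, β the stream-specific parameters, is:

Q(s, a; \theta, \alpha, \beta) = V(s; \theta, \beta) + \left( A(s, a; \theta, \alpha) - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a'; \theta, \alpha) \right)

Subtracting the mean advantage forces the advantages to have zero mean in each state, which keeps the V/A decomposition well-behaved during optimization without changing the relative ranking of actions.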