2019 IEEE Latin American Conference on Computational Intelligence (LA-CCI)
DOI: 10.1109/la-cci47412.2019.9036763
Performing Deep Recurrent Double Q-Learning for Atari Games

Abstract: Currently, many applications of Machine Learning are based on defining new models to extract more information from data. Deep Reinforcement Learning, best known through video games such as Atari, Mario, and others, has changed how computers can learn by themselves using only the rewards obtained from their actions. Many algorithms have been modeled and implemented based on the Deep Recurrent Q-Learning proposed by DeepMind and used in AlphaZero and Go. In this documen…
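The "learning only from rewards" the abstract refers to is the Q-learning update that the paper's Deep Recurrent Double Q-Learning builds on. Below is a minimal tabular sketch of that idea; it is not code from the paper, and the state/action counts, learning rate, and discount factor are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not from the paper): tabular Q-learning, i.e. learning only
# from rewards obtained after each action. Deep (Recurrent/Double) Q-Learning
# replaces this table with neural networks.
n_states, n_actions = 16, 4
alpha, gamma = 0.1, 0.99          # learning rate and discount factor (illustrative)
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next, done):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
```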

Cited by 19 publications (8 citation statements). References 5 publications.
“…In COMPER, update values R(τ_t, Ω) used to obtain target values are provided by a recurrent neural network (RNN) parameterized by Ω. That is completely different from some approaches in the literature that have proposed the adoption of recurrent units at the final layers of the target network, such as Hausknecht and Stone (2015) and Moreno-Vera (2019). Here, an RNN is adopted not only to predict values that are used to calculate target values during training but also to build a model that is explored to generate the compact structure of RTM representing previous experiences.…”
Section: Methods Outline
confidence: 99%
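For contrast with the COMPER design described in the statement above, the following is a minimal sketch of what "recurrent units at the final layers" of a Q-network can look like, in the spirit of Hausknecht and Stone (2015) and Moreno-Vera (2019). It assumes a PyTorch-style convolutional front end over single 84x84 frames followed by an LSTM; all layer sizes are illustrative and not taken from either paper.

```python
import torch.nn as nn

# Sketch: Q-network whose final layers are recurrent, so temporal context comes
# from the LSTM state rather than from stacking frames at the input.
class RecurrentQNetwork(nn.Module):
    def __init__(self, n_actions: int, hidden: int = 512):
        super().__init__()
        self.conv = nn.Sequential(                      # convolutional front end over one frame
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.lstm = nn.LSTM(64 * 7 * 7, hidden, batch_first=True)  # recurrence at the end
        self.head = nn.Linear(hidden, n_actions)                   # one Q-value per action

    def forward(self, frames, hx=None):
        # frames: (batch, time, 1, 84, 84) -> Q-values: (batch, time, n_actions)
        b, t = frames.shape[:2]
        feats = self.conv(frames.flatten(0, 1)).view(b, t, -1)
        out, hx = self.lstm(feats, hx)
        return self.head(out), hx
```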
“…They stated that a recurrent network is a viable approach for dealing with observations from multiple states, but it presents no systematic benefits compared to stacking these observations in the input layer of a plain CNN. Moreno-Vera (2019) proposed a similar approach but using DDQN instead of DQN. Wang et al (2016) proposed an architecture named Dueling Network in which they used two parallel streams (instead of a single sequence of fully connected layers) just after the convolutional layers, that are combined in the output by an aggregation layer to produce the estimates of Q-values.…”
Section: Literature Review and Related Work
confidence: 99%
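The dueling aggregation described in the statement above can be sketched as follows, assuming PyTorch and an already-computed feature vector from the shared convolutional layers; the hidden size is an illustrative assumption, not Wang et al.'s exact configuration.

```python
import torch.nn as nn

# Sketch of the dueling head: two parallel streams after the shared features,
# combined by an aggregation layer as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
class DuelingHead(nn.Module):
    def __init__(self, feature_dim: int, n_actions: int, hidden: int = 512):
        super().__init__()
        self.value = nn.Sequential(nn.Linear(feature_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, 1))              # state-value stream V(s)
        self.advantage = nn.Sequential(nn.Linear(feature_dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, n_actions))  # advantage stream A(s, a)

    def forward(self, features):
        v = self.value(features)                       # (batch, 1)
        a = self.advantage(features)                   # (batch, n_actions)
        return v + a - a.mean(dim=1, keepdim=True)     # aggregation layer producing Q-values
```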
See 1 more Smart Citation
“…In order to reduce overestimations, Hasselt et al. designed the DDQN [25] from the idea of double Q-learning [26,27]. The online network and the target network are designed to decouple the selection from the evaluation.…”
Section: DQN and DDQN
confidence: 99%
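A minimal sketch of that decoupling, assuming PyTorch and already-constructed online and target networks (names here are hypothetical): the online network selects the next action and the target network evaluates it.

```python
import torch

# Sketch of the Double DQN target: selection by the online network,
# evaluation by the target network. Only the target computation is shown.
@torch.no_grad()
def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    next_actions = online_net(next_states).argmax(dim=1, keepdim=True)   # selection (online)
    next_q = target_net(next_states).gather(1, next_actions).squeeze(1)  # evaluation (target)
    return rewards + gamma * (1.0 - dones.float()) * next_q
```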
“…[19]. Since then, new ideas have been proposed, and the learning performance of RL using RNNs has improved dramatically [20,21,22]. However, these algorithms tend to be more complex and computationally expensive, and the vanishing/exploding gradient problem still remains.…”
Section: Introduction
confidence: 99%