2018
DOI: 10.1007/978-3-030-04191-5_11
The Dreaming Variational Autoencoder for Reinforcement Learning Environments

Abstract: Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optimum. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learn…

Cited by 13 publications (14 citation statements)
References 12 publications (21 reference statements)
“…Based on this architecture, the encoded features z are used for clustering the activities. CVAEs are generative models defined in [53], which are commonly used for dimensionality reduction [54], data augmentation [55], and reinforcement learning [56]. Considering Doppler radar data, CVAEs have been used for synthetic data generation [57].…”
Section: Convolution Filter-based Methods
confidence: 99%
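The excerpt above describes conditional VAEs (CVAEs), whose encoder and decoder both receive a conditioning vector (e.g. a class label) alongside the data. A minimal numpy sketch of that conditioning pattern follows; all dimensions and weight matrices are toy placeholders standing in for trained parameters, not values from any cited paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sizes, purely illustrative.
OBS_DIM, COND_DIM, LATENT_DIM = 32, 5, 6

# Random matrices stand in for trained encoder/decoder weights.
W_enc = rng.normal(scale=0.1, size=(2 * LATENT_DIM, OBS_DIM + COND_DIM))
W_dec = rng.normal(scale=0.1, size=(OBS_DIM, LATENT_DIM + COND_DIM))

def cvae_encode(x, c):
    """Conditional encoder: the input is concatenated with the condition."""
    out = W_enc @ np.concatenate([x, c])
    mu, logvar = out[:LATENT_DIM], out[LATENT_DIM:]
    return mu, logvar

def cvae_decode(z, c):
    """Conditional decoder: the reconstruction also sees the condition."""
    return W_dec @ np.concatenate([z, c])

x = rng.standard_normal(OBS_DIM)
c = np.eye(COND_DIM)[2]  # one-hot condition, e.g. an activity label
mu, logvar = cvae_encode(x, c)
# Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
z = mu + np.exp(0.5 * logvar) * rng.standard_normal(LATENT_DIM)
x_hat = cvae_decode(z, c)
print(mu.shape, x_hat.shape)
```

Because the condition enters both halves of the model, sampling z with different one-hot vectors c yields class-specific outputs, which is what makes CVAEs useful for the data augmentation and synthetic data generation mentioned above.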
“…In [6], Ha et al. proposed World Models, an architecture that models the environment with a VAE and a recurrent neural network (RNN), showing that an agent can learn the optimal policy using only generated training samples. Similarly, Andersen et al. [1] proposed the Dreaming Variational Autoencoder, an architecture that models the environment with a VAE and an RNN, using real trajectories from the actual environment to imitate its behavior. Conversely, Andersen et al. [2] found that in high-dimensional tasks, simple heuristic exploration often becomes trapped in local minima of the state space, which may cause the generative model to become inaccurate or even collapse.…”
Section: Related Work
confidence: 99%
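The VAE-plus-RNN scheme described above can be sketched in a few lines: encode an observation to a Gaussian latent, sample it with the reparameterization trick, then roll a recurrent transition model forward on (latent, action) pairs to "dream" a trajectory without touching the real environment. Everything below is a hedged toy illustration — the dimensions, weight matrices, and function names are invented for this sketch and are not the architecture from [1] or [6].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative, not from the paper).
OBS_DIM, LATENT_DIM, ACTION_DIM, HIDDEN_DIM = 64, 8, 4, 16

# Randomly initialized weights stand in for trained parameters.
W_mu = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))
W_logvar = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))
W_h = rng.normal(scale=0.1, size=(HIDDEN_DIM, HIDDEN_DIM + LATENT_DIM + ACTION_DIM))
W_z = rng.normal(scale=0.1, size=(LATENT_DIM, HIDDEN_DIM))

def encode(obs):
    """VAE encoder: observation -> parameters of a Gaussian latent."""
    return W_mu @ obs, W_logvar @ obs

def reparameterize(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def transition(h, z, a):
    """RNN-style step: predict the next hidden state and latent from (h, z, a)."""
    h_next = np.tanh(W_h @ np.concatenate([h, z, a]))
    return h_next, W_z @ h_next

# "Dream" a short trajectory entirely in latent space.
obs = rng.standard_normal(OBS_DIM)
mu, logvar = encode(obs)
z = reparameterize(mu, logvar)
h = np.zeros(HIDDEN_DIM)
trajectory = [z]
for _ in range(5):
    a = np.eye(ACTION_DIM)[rng.integers(ACTION_DIM)]  # random one-hot action
    h, z = transition(h, z, a)
    trajectory.append(z)

print(len(trajectory), trajectory[0].shape)
```

The point of the rollout loop is the one made in the excerpt: once the encoder and transition model imitate the real environment well enough, an agent can generate training trajectories from them alone, which is also why a collapsed or inaccurate generative model (as noted in [2]) is fatal to the approach.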
“…However, in many scenarios, training samples may be difficult or time-consuming to obtain. Thus, some researchers attempt to represent the actual environment with a generative model [1,6,8] to improve sample efficiency. Once the generative model is sufficiently trained, the DRL algorithm can be trained without interacting with the actual environment.…”
Section: Introduction
confidence: 99%
“…Deep learning has been used for a myriad of applications ranging from games to medicine, but its applicability has only partly been explored for fish classification [10][11][12][13][14]. A specific Convolutional Neural Network (CNN) called Fast R-CNN has been applied for object detection to extract fish from images taken in a natural environment while actively ignoring background noise [6].…”
Section: Introduction
confidence: 99%