2021 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra48506.2021.9560734
Dreaming: Model-based Reinforcement Learning by Latent Imagination without Reconstruction

Cited by 40 publications (37 citation statements)
References 5 publications
“…While previous work often found contrastive learning to be ineffective, we show that combining it with recurrent state space models makes it work. Recently, a contrastive variant of Dreamer (Okada & Taniguchi, 2020) has been proposed which shares the same motivation. Concurrent with our work, Nguyen et al. (2021) explore a formulation similar to ours based on temporal predictive coding, but do not evaluate it on the difficult camera and color distractions we do here.…”
Section: Discussion and Related Work
Mentioning confidence: 99%
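The contrastive objective this passage alludes to is typically an InfoNCE loss that scores latents predicted by the dynamics model against latents encoded from the corresponding observations, replacing pixel reconstruction entirely. Below is a minimal sketch of such a loss; the tensor names (`z_pred`, `z_enc`) and the temperature value are illustrative assumptions, not code from any of the cited papers.

```python
import torch
import torch.nn.functional as F

def infonce_loss(z_pred: torch.Tensor, z_enc: torch.Tensor,
                 temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style contrastive loss between predicted and encoded latents.

    z_pred: (B, D) latents predicted by the forward (dynamics) model.
    z_enc:  (B, D) latents encoded from the matching observations.
    Matching rows are positives; every other row in the batch serves as a
    negative, so no decoder / reconstruction term is needed.
    """
    z_pred = F.normalize(z_pred, dim=-1)
    z_enc = F.normalize(z_enc, dim=-1)
    # (B, B) similarity matrix; entry (i, j) compares prediction i with encoding j.
    logits = z_pred @ z_enc.t() / temperature
    targets = torch.arange(z_pred.size(0), device=z_pred.device)
    # Cross-entropy pushes the diagonal (true pairs) above all in-batch negatives.
    return F.cross_entropy(logits, targets)
```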
“…Relying on image reconstruction can, however, lead to vulnerability to visual noise: to overcome this limitation, Okada and Taniguchi [33] and Zhang et al. [43] forgo the decoder network, with the latter proposing to rely on the notion of bisimilarity to learn meaningful representations. Similarly, Gelada et al. [16] learn to predict only rewards and action-conditional state distributions, but study this objective solely as an auxiliary loss for model-free reinforcement learning methods.…”
Section: Related Work
Mentioning confidence: 99%
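For intuition, a bisimulation-style representation loss in the spirit of Zhang et al. [43] regresses the distance between latent states onto the difference in reward plus the discounted distance between predicted next-state distributions, so states that behave alike end up close in latent space without any decoder. The sketch below assumes diagonal-Gaussian next-latent heads so the 2-Wasserstein term has a closed form; all names and shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

def bisimulation_loss(z, reward, next_mean, next_std, gamma: float = 0.99):
    """Bisimulation-style representation loss (a sketch, not the paper's code).

    z:         (B, D) current latent states.
    reward:    (B,)   rewards for those states.
    next_mean: (B, D) mean of a Gaussian next-latent prediction.
    next_std:  (B, D) std of that Gaussian.
    Each state is paired with a shuffled copy of the batch; the L1 latent
    distance is regressed onto reward difference plus discounted distance
    between the predicted next-state distributions.
    """
    perm = torch.randperm(z.size(0), device=z.device)
    z_dist = (z - z[perm]).abs().sum(dim=-1)   # L1 distance in latent space
    r_dist = (reward - reward[perm]).abs()
    # Closed-form 2-Wasserstein distance between diagonal Gaussians.
    w2 = ((next_mean - next_mean[perm]).pow(2).sum(-1)
          + (next_std - next_std[perm]).pow(2).sum(-1)).sqrt()
    target = (r_dist + gamma * w2).detach()    # no gradient through the target
    return F.mse_loss(z_dist, target)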
“…World models [1] are a potential approach to achieving visual servoing of robots in industry. World models, which are equipped with compact latent representation models and latent forward dynamics, efficiently predict future trajectories and rewards, allowing us to acquire model predictive controllers [2]-[4] and policies learned by model-based reinforcement learning [5]-[7]. In addition, world models have various valuable properties for industrial applications, such as transferability to new tasks [8], unsupervised exploration [9], generalization from offline datasets [10], and explainability [11].…”
Section: Introduction
Mentioning confidence: 99%
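A minimal sketch of the latent rollout such world models perform: starting from an encoded latent state, the learned dynamics is unrolled without ever decoding back to images, and the predicted rewards can feed either an MPC planner or a model-based policy update. The `dynamics`, `reward_head`, and `policy` callables are assumed stand-ins for the learned components, not any specific paper's API.

```python
import torch

def imagine_rollout(dynamics, reward_head, policy, z0: torch.Tensor, horizon: int):
    """Unroll a learned latent dynamics model without decoding images.

    dynamics(z, a) -> next latent, reward_head(z) -> predicted reward, and
    policy(z) -> action are assumed stand-ins for learned modules. Returns
    imagined latents and rewards; keeping the graph differentiable lets a
    Dreamer-style policy update backpropagate through the rollout, while a
    planner can simply wrap the call in torch.no_grad().
    """
    latents, rewards = [z0], []
    z = z0
    for _ in range(horizon):
        a = policy(z)           # act from the latent state alone
        z = dynamics(z, a)      # predict the next latent; no pixels involved
        latents.append(z)
        rewards.append(reward_head(z))
    return torch.stack(latents), torch.stack(rewards)
```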
“…DreamerV2 [6] is a leading world-model-based reinforcement learning method that achieved human-level performance on the Atari benchmark. Unlike previous world models [2], [3], [7], including Dreamer [5] (the earlier version of DreamerV2), this method uses discrete world models in which discrete random variables represent latent states. A motivation for introducing discrete representations is that categorical distributions can naturally capture the multimodal uncertainty of stochastic state transitions.…”
Section: Introduction
Mentioning confidence: 99%
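A common way to make such categorical latents trainable, and the one DreamerV2 reports using, is the straight-through gradient estimator: sample hard one-hot vectors in the forward pass and route gradients through the softmax probabilities in the backward pass. Below is a minimal sketch under that assumption; the shapes and names are illustrative, not DreamerV2's actual code.

```python
import torch
import torch.nn.functional as F

def sample_discrete_latent(logits: torch.Tensor) -> torch.Tensor:
    """Sample discrete latents with straight-through gradients.

    logits: (B, groups, classes) parameterizing independent categoricals.
    The forward pass uses hard one-hot samples, so the latent can represent
    multimodal transitions; the backward pass routes gradients through the
    softmax probabilities (straight-through estimator).
    """
    probs = F.softmax(logits, dim=-1)
    index = torch.distributions.Categorical(probs=probs).sample()
    one_hot = F.one_hot(index, num_classes=logits.size(-1)).to(probs.dtype)
    # Numerically equal to one_hot in the forward pass; gradients flow to probs.
    return one_hot + probs - probs.detach()
```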