2021
DOI: 10.48550/arxiv.2110.03608
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
5
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
1
1

Relationship

1
1

Authors

Journals

citations
Cited by 2 publications
(5 citation statements)
references
References 0 publications
0
5
0
Order By: Relevance
“…Another approach, the Multimodal Factorization Model (MFM) 43 , proposes the factorization of the multimodal representation into separate, independent representations. Vasco et al 44 proposed a hierarchical design, called MUSE, to learn a hierarchical multimodal representation, beginning with low-level modality-specific representations from raw observation data and ending with a high-level multimodal representation encoding joint-modality information.…”
Section: Multi-modal Perception Learningmentioning
confidence: 99%
“…Another approach, the Multimodal Factorization Model (MFM) 43 , proposes the factorization of the multimodal representation into separate, independent representations. Vasco et al 44 proposed a hierarchical design, called MUSE, to learn a hierarchical multimodal representation, beginning with low-level modality-specific representations from raw observation data and ending with a high-level multimodal representation encoding joint-modality information.…”
Section: Multi-modal Perception Learningmentioning
confidence: 99%
“…Prior work has demonstrated success in using complete representations z 1:M in a diverse set of applications, such as image generation (Wu & Goodman, 2018;Shi et al, 2019) and control of Atari games (Silva et al, 2019;Vasco et al, 2021). Intuitively, if complete representations z 1:M are sufficient to perform a downstream task then learning modality-specific representations z m that are geometrically aligned with z 1:M in the same representation space should ensure that {z m } contain necessary information to perform the task even when z 1:M cannot be provided.…”
Section: The Problem Of Geometric Misalignment In Multimodal Represen...mentioning
confidence: 99%
“…Recently, hierarchical multimodal VAEs have been proposed to facilitate the learning of aligned multimodal representations such as Nexus (Vasco et al, 2022) and Multimodal Sensing (MUSE) (Vasco et al, 2021). Nexus considers a two-level hierarchy of modality-specific and multimodal representation spaces employing a dropout-based training scheme.…”
Section: Related Workmentioning
confidence: 99%
See 2 more Smart Citations