2022
DOI: 10.48550/arxiv.2201.09862
Preprint

Learning to Act with Affordance-Aware Multimodal Neural SLAM

Abstract: Recent years have witnessed an emerging paradigm shift toward embodied artificial intelligence, in which an agent must learn to solve challenging tasks by interacting with its environment. There are several challenges in solving embodied multimodal tasks, including long-horizon planning, vision-and-language grounding, and efficient exploration. We focus on a critical bottleneck, namely the performance of planning and navigation. To tackle this challenge, we propose a Neural SLAM approach that, for the first time…

Cited by 1 publication (2 citation statements); references 18 publications.

Citation statements:
“…In simulated environments, Logeswaran et al (2022) propose a language-only finetuned GPT-2 model for task planning on ALFRED . Some end-to-end ALFRED models also have task planning as a component (Min et al, 2021;Jia et al, 2022;Blukis et al, 2022). However, this is a simpler dataset where task planning can be cast as a 7-way classification problem.…”
Section: Related Work
confidence: 99%
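The quoted remark that ALFRED task planning "can be cast as a 7-way classification problem" refers to ALFRED's seven high-level task types. The sketch below is a minimal, hypothetical illustration of that framing using a bag-of-words classifier; the example instructions are invented, and none of the cited papers use this toy pipeline (they fine-tune language models instead).

```python
# Minimal sketch: casting ALFRED high-level task planning as 7-way text
# classification. Illustrative toy only, not the method of any cited paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# ALFRED's seven high-level task types.
TASK_TYPES = [
    "pick_and_place_simple",
    "pick_two_obj_and_place",
    "pick_and_place_with_movable_recep",
    "pick_clean_then_place_in_recep",
    "pick_heat_then_place_in_recep",
    "pick_cool_then_place_in_recep",
    "look_at_obj_in_light",
]

# Hypothetical training pairs: (natural-language goal, task type).
train = [
    ("put a mug on the desk", "pick_and_place_simple"),
    ("put two pencils in the drawer", "pick_two_obj_and_place"),
    ("put a spoon in a cup on the counter", "pick_and_place_with_movable_recep"),
    ("wash the plate and put it in the cabinet", "pick_clean_then_place_in_recep"),
    ("heat a potato and put it on the table", "pick_heat_then_place_in_recep"),
    ("chill the wine bottle and put it on the shelf", "pick_cool_then_place_in_recep"),
    ("examine a book under the lamp", "look_at_obj_in_light"),
]
texts, labels = zip(*train)

# Bag-of-words + logistic regression: the simplest instance of the
# "task planning as 7-way classification" framing.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

print(clf.predict(["rinse the mug and put it in the coffee maker"])[0])
```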
“…In such a system, the coffee task considered above would likely start by invoking a semantic navigation module to find the mug and a grasping module to pick it up. Some prior work has been on embodied AI benchmarks suggesting that more modular models can outperform monolithic models (Min et al, 2021;Jia et al, 2022;Zheng et al, 2022;Min et al, 2022). However, these do not evaluate and explore the limitations of individual modules.…”
Section: Introduction
confidence: 99%
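The quoted passage describes a modular agent that sequences specialised skill modules (semantic navigation to find the mug, grasping to pick it up). The sketch below illustrates that decomposition with a toy controller; the module names, interfaces, and the coffee-task plan are hypothetical and not taken from the cited systems.

```python
# Minimal sketch of the modular decomposition described above: a controller
# that sequences specialised skill modules for a "make coffee" style task.
from dataclasses import dataclass
from typing import Protocol


class SkillModule(Protocol):
    def run(self, target: str) -> bool:
        """Execute the skill for the given target; return success."""


@dataclass
class SemanticNavigator:
    def run(self, target: str) -> bool:
        print(f"[nav] searching the map for a {target}")
        return True  # stub: assume the object was found


@dataclass
class Grasper:
    def run(self, target: str) -> bool:
        print(f"[grasp] picking up the {target}")
        return True  # stub: assume the grasp succeeded


def run_task(plan: list[tuple[SkillModule, str]]) -> bool:
    """Execute a high-level plan by invoking one module per step."""
    return all(module.run(target) for module, target in plan)


# The coffee example from the quoted passage: find the mug, then pick it up.
coffee_plan = [(SemanticNavigator(), "mug"), (Grasper(), "mug")]
run_task(coffee_plan)
```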