2024
DOI: 10.1609/aaai.v38i17.29854

MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks

Jingyuan Qi,
Minqian Liu,
Ying Shen
et al.

Abstract: Automatically generating scripts (i.e., sequences of key steps described in text) from video demonstrations and reasoning about the subsequent steps are crucial for modern AI virtual assistants to guide humans to complete everyday tasks, especially unfamiliar ones. However, current methods for generative script learning rely heavily on well-structured preceding steps described in text and/or images, or are limited to a certain domain, resulting in a disparity with real-world user scenarios. To address these l…

citations
Cited by 0 publications
references
References 21 publications
(23 reference statements)