2024
DOI: 10.3390/app14114696

Multimodal Attention-Based Instruction-Following Part-Level Affordance Grounding

Wen Qu, Lulu Guo, Jian Cui, et al.

Abstract: The integration of language and vision for object affordance understanding is pivotal for the advancement of embodied agents. Current approaches are often limited by reliance on segregated pre-processing stages for language interpretation and object localization, leading to inefficiencies and error propagation in affordance segmentation. To overcome these limitations, this study introduces a unique task, part-level affordance grounding, in direct response to natural language instructions. We present the Instru…

Cited by 1 publication (1 citation statement)
References: 67 publications
“…This initial step ensures that the performance measurements are not adversely influenced by any noise potentially introduced in the semi-automatically generated data. For this purpose, we utilized two well-established datasets: IIT-AFF VL and UMD VL ( Qu et al, 2024 ). We first introduce the datasets in details, then present the evaluation metrics for the affordance grounding task.…”
Section: Methods (citation type: mentioning; confidence: 99%)
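The excerpt above notes that the citing work evaluates on the IIT-AFF VL and UMD VL datasets and then presents evaluation metrics for affordance grounding, but the metrics themselves are not reproduced here. As an illustration only, the sketch below shows per-class mask intersection-over-union (IoU), a metric commonly reported for part-level affordance segmentation; the function name, label convention, and toy data are assumptions for this example and are not taken from either paper.

```python
import numpy as np

def affordance_iou(pred_mask: np.ndarray, gt_mask: np.ndarray, num_classes: int) -> dict:
    """Per-class intersection-over-union for affordance segmentation masks.

    pred_mask, gt_mask: integer label maps of shape (H, W), where 0 is
    background and 1..num_classes-1 are affordance classes.
    Returns {class_id: IoU}; classes absent from both masks are skipped.
    """
    ious = {}
    for c in range(1, num_classes):
        pred_c = pred_mask == c
        gt_c = gt_mask == c
        union = np.logical_or(pred_c, gt_c).sum()
        if union == 0:
            continue  # class not present in prediction or ground truth
        inter = np.logical_and(pred_c, gt_c).sum()
        ious[c] = inter / union
    return ious

# Toy example: 4x4 label maps with a single affordance class (id 1).
pred = np.zeros((4, 4), dtype=int)
gt = np.zeros((4, 4), dtype=int)
pred[1:3, 1:3] = 1   # predicted region: 4 pixels
gt[1:4, 1:4] = 1     # ground-truth region: 9 pixels
print(affordance_iou(pred, gt, num_classes=2))  # {1: 0.444...} = 4/9
```

For heatmap-style affordance grounding, saliency metrics such as KLD, SIM, and NSS are also widely used; the discrete IoU form above applies to pixel-level part masks of the kind provided by the UMD and IIT-AFF annotations.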