2021
DOI: 10.1109/access.2021.3113781

Probing Spatial Clues: Canonical Spatial Templates for Object Relationship Understanding

Abstract: Humans often leverage spatial clues to categorize scenes in a fraction of a second. This form of intelligence is very relevant in time-critical situations (e.g., when driving a car) and valuable to transfer to automated systems. This work investigates the predictive power of solely processing spatial clues for scene understanding in 2D images and compares such an approach with the predictive power of visual appearance. To this end, we design the laboratory task of predicting the identity of two objects (e.g., …

Citations: cited by 4 publications (3 citation statements)
References: 95 publications (103 reference statements)
“…With regards to grounding text to physical locations, Lourentzou, Morales, and Zhai (2017) focus on predicting physical geographic origins of Twitter posts, while Grujicic et al (2020) localize medical text referring to anatomical concepts to their corresponding physical locations in the human body. Finally, (Collell, Deruyttere, and Moens 2021) have also worked on understanding the implicit spatial relationships of queries and objects.…”
Section: Query Understanding (mentioning)
confidence: 99%
“…Coarse layout and fine-grained layout spatial features were exploited in [23] to predict HOI, and the authors argued that the appearance features of objects did not affect the HOI prediction performance. Spatial clues for scene understanding are investigated in [50] and propose canonical spatial representation templates that indicate the power of spatial features in visual relationship applications and outperform many HOI state-of-the-art models.…”
Section: E. Spatial Features in HOI (mentioning)
confidence: 99%
“…Unlike the works that ground the referring expressions in a visual scene, the works of [40], [41] localize the referring expression in the physical geographic regions or the anatomical model of the human body. The work of [42] focuses on capturing implicit spatial relationships between different kinds of objects that appear in the natural language commands. On the other hand, the work of [43] focuses on the task laid out in Touchdown [11], which involves following passenger instructions.…”
Section: B. Natural Language Commands (mentioning)
confidence: 99%