Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
DOI: 10.18653/v1/n18-2124
Visually Guided Spatial Relation Extraction from Text

Abstract: Extraction of spatial relations from sentences with complex/nested relationships is very challenging, as it often requires resolving inherent semantic ambiguities. We seek help from the visual modality to fill the information gap in the text modality and resolve spatial semantic ambiguities. We use various recent vision-and-language datasets and techniques to train inter-modality alignment models and visual relationship classifiers, and propose a novel global inference model to integrate these components into our structured…
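As a rough sketch of that integration idea (not the paper's actual code; the blending function, weights, and scores below are assumptions), visual relationship scores computed on aligned image regions could be combined with text-only classifier scores for the same candidate relation:

```python
# Hypothetical sketch: fusing textual and visual evidence for one
# candidate spatial relation (trajector, indicator, landmark).
# The linear blend and the alpha weight are illustrative assumptions.

def combined_relation_score(text_score, visual_score, alpha=0.5):
    """Blend a text-only classifier score with a visual relationship
    classifier score for the same candidate relation."""
    return alpha * text_score + (1.0 - alpha) * visual_score

# Example: the text classifier is fairly confident; the visual
# classifier, run on the aligned image regions, agrees weakly.
print(combined_relation_score(text_score=0.8, visual_score=0.55))  # 0.675
```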

Cited by 6 publications (7 citation statements; 0 supporting, 7 mentioning, 0 contrasting). References 21 publications.
“…The code is publicly available here 2 . We compare our approach with the state-of-the-art (Rahgooy et al., 2018). However, in the mentioned paper, the authors use visual data from the accompanying images to improve the models.…”
Section: Results (mentioning)
Confidence: 99%
“…Spatial semantics is very closely connected and relevant to visualization of natural language and grounding language into perception, central to dealing with configurations in the physical world and motivating a combination of vision and language for richer spatial understanding. The related tasks include: text-to-scene conversion; image captioning; spatial and visual question answering; and spatial understanding in multimodal settings (Rahgooy et al., 2018) for robotics and navigation tasks and language grounding (Thomason et al., 2018).…”
Section: Description (mentioning)
Confidence: 99%
“…This function is a linear discriminant function defined over a combined feature representation of inputs and outputs, denoted by f(x, y). However, in this work, independent classifiers are trained per role and relation, and only the prediction is performed based on the global inference, as in (Kordjamshidi et al., 2017a; Rahgooy et al., 2018).…”
Section: Learning Model (mentioning)
Confidence: 99%
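For intuition, a linear discriminant of the form f(x, y) = w · φ(x, y) can be sketched as below; the toy feature map φ, the random weights, and the candidate set are placeholders, not the trained models of the cited work:

```python
import numpy as np

def phi(x, y):
    """Toy joint feature map over an input x and a candidate output y:
    the outer product of the two feature vectors, flattened."""
    return np.outer(x, y).ravel()

def f(w, x, y):
    """Linear discriminant: score of candidate output y for input x."""
    return w @ phi(x, y)

rng = np.random.default_rng(0)
x = rng.normal(size=4)        # input features (e.g., for one phrase)
candidates = list(np.eye(3))  # one-hot candidates (e.g., role labels)
w = rng.normal(size=4 * 3)    # weights (random here; learned in practice)

# An independent classifier would take this argmax per decision; in the
# cited approach the final prediction is instead taken jointly, under
# global constraints over all roles and relations.
best = max(candidates, key=lambda y: f(w, x, y))
print(best)
```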
“…The global constraints used in our proposed model are a combination of the previously proposed constraints (1-7) (Rahgooy et al., 2018) and a new one (constraint 8), described in Table 3.3. In fact, the global inference is performed using integer linear programming techniques subject to these constraints.…”
Section: Constraints (mentioning)
Confidence: 99%
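A minimal sketch of such ILP-based global inference, using the PuLP library, is shown below; the candidate roles, their scores, and the at-most-one-relation constraint are invented for illustration and are not the constraints (1-8) of the cited model:

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum

# Toy candidates: local classifier scores for spatial roles and for two
# candidate relations that share a trajector and an indicator.
role_scores = {"tr1": 0.9, "sp1": 0.8, "lm1": 0.7, "lm2": 0.2}
rel_scores = {("tr1", "sp1", "lm1"): 0.6, ("tr1", "sp1", "lm2"): 0.4}

prob = LpProblem("spatial_global_inference", LpMaximize)
role = {r: LpVariable(f"role_{r}", cat="Binary") for r in role_scores}
rel = {t: LpVariable(f"rel_{i}", cat="Binary")
       for i, t in enumerate(rel_scores)}

# Objective: total score of the selected roles and relations.
prob += (lpSum(role_scores[r] * role[r] for r in role_scores)
         + lpSum(rel_scores[t] * rel[t] for t in rel_scores))

# Consistency: a relation may be selected only if all three of its
# arguments are also selected as roles.
for (tr, sp, lm), v in rel.items():
    prob += v <= role[tr]
    prob += v <= role[sp]
    prob += v <= role[lm]

# Illustrative mutual-exclusivity constraint (an assumption for this
# toy example): select at most one relation overall.
prob += lpSum(rel.values()) <= 1

prob.solve()
print([t for t, v in rel.items() if v.value() == 1])
```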