Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL) 2019
DOI: 10.18653/v1/k19-1040

Leveraging Past References for Robust Language Grounding

Abstract: Grounding referring expressions to objects in an environment has traditionally been considered a one-off, ahistorical task. However, in realistic applications of grounding, multiple users will repeatedly refer to the same set of objects. As a result, past referring expressions for objects can provide strong signals for grounding subsequent referring expressions. We therefore reframe the grounding problem from the perspective of coreference detection and propose a neural network that detects when two expression…

Citations: cited by 8 publications (10 citation statements)
References: 23 publications (21 reference statements)
“…Incorporating discourse history. Previous work has incorporated discourse history in reference games using explicit co-reference detection (Roy et al, 2019) or contribution tracking (DeVault and Stone, 2009) techniques. An alternative approach is to include embeddings of the history as conditional input to the model at test time (Haber et al, 2019).…”
Section: Related Work (mentioning)
confidence: 99%
“…In contrast, we generate the surface realisation of first and subsequent referring utterances end-to-end, grounding them in continuous visual features of real images. Our work is related to a recent line of research on reference resolution in visually-grounded dialogue, where previous mentions have been shown to be useful (Shore and Skantze, 2018; Haber et al, 2019; Roy et al, 2019). Here we focus on generation.…”
Section: Related Work (mentioning)
confidence: 99%
“…• Non-textual modality: Fusion is applied with images for tasks like referring expressions (Roy et al, 2019), SRL etc., For videos, some tasks are grounding action descriptions (Regneri et al, 2013), spatio-temporal QA (Lei et al, 2020), concept similarity , mapping events (Fleischman and Roy, 2008) etc.,…”
Section: Manipulating Representations (mentioning)
confidence: 99%
“…• Non-textual Modality: Multitasking with images is used to perform spoken image captioning (Chrupala, 2019) and grammar induction (Zhao and Titov, 2020). Joint modeling was used in multiresolution language grounding Koncel-Kedziorski et al (2014), identifying referring expressions Roy et al (2019), multimodal MT (Zhou et al, 2018c), video parsing Ross et al (2018), learning latent semantic annotations (Qin et al, 2018) etc.,…”
Section: Learning Objective (mentioning)
confidence: 99%