Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs

Zarrieß, Sina; Schlangen, David

doi:10.18653/v1/p16-1058

Cited by 14 publications

(30 citation statements)

References 31 publications

“…Additional evaluation metrics, such as success rates in a human evaluation (cf. Zarrieß and Schlangen (2016)), would be an interesting direction for more detailed investigation here.…”

Section: Word Similarities Many Of the Examples In (mentioning)

confidence: 98%

“…These features are then associated in a learning process with certain words, resulting in an association of colour features with colour words, spatial features with prepositions, etc., and based on this, these words can be interpreted with reference to the scene currently presented to the video feed. Whereas Roy's work still looked at relatively simple scenes with graphical objects, research on REG has recently started to investigate set-ups based on real-world images (Kazemzadeh et al., 2014; Gkatzia et al., 2015; Zarrieß and Schlangen, 2016; Mao et al., 2015). Importantly, the low-level visual features that can be extracted from these scenes correspond less directly to particular word classes.…”

Section: Related Work (mentioning)

confidence: 99%

“…Moreover, the visual scenes contain many different types of objects, which poses new challenges for REG. For instance, Zarrieß and Schlangen (2016) find that semantic errors related to mismatches between nouns (e.g. the system generates tree vs. man) are particularly disturbing for users.…”

Section: Related Work (mentioning)

confidence: 99%

“…the system generates tree vs. man) are particularly disturbing for users. Whereas Zarrieß and Schlangen (2016) propose a strategy to avoid object names when the system's confidence is low, we focus on improving the generation of object names, using distributional knowledge as an additional source. Similarly, Ordonez et al. (2016) have studied the problem of deriving appropriate object names, or so-called entry-level categories, from the output of an object recognizer.…”

Section: Related Work (mentioning)

confidence: 99%

Obtaining referential word meanings from visual and distributional information: Experiments on object naming

Zarrieß, Schlangen

2017

Proceedings of the 55th Annual Meeting of the Association For Computational Linguistics (Volume 1: Long Papers)

Self Cite

We investigate object naming, which is an important sub-task of referring expression generation on real-world images. As opposed to mutually exclusive labels used in object recognition, object names are more flexible, subject to communicative preferences and semantically related to each other. Therefore, we investigate models of referential word meaning that link visual to lexical information which we assume to be given through distributional word embeddings. We present a model that learns individual predictors for object names that link visual and distributional aspects of word meaning during training. We show that this is particularly beneficial for zero-shot learning, as compared to projecting visual objects directly into the distributional space. In a standard object naming task, we find that different ways of combining lexical and visual information achieve very similar performance, though experiments on model combination suggest that they capture complementary aspects of referential meaning.

“…To factor out effects of compositionality and context that arise in reference generation or resolution, we measure how well a predictor for a word w is able to retrieve from a sampled test set objects that have been referred to by w (Schlangen et al. (2016); Zarrieß and Schlangen (2016a) evaluate on full referring expressions).…”

Section: Experimental Set-up (mentioning)

confidence: 99%

Is this a Child, a Girl or a Car? Exploring the Contribution of Distributional Similarity to Learning Referential Word Meanings

Zarrieß

Schlangen

2017

Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 2

Self Cite

There has recently been a lot of work trying to use images of referents of words for improving vector space meaning representations derived from text. We investigate the opposite direction, as it were, trying to improve visual word predictors that identify objects in images, by exploiting distributional similarity information during training. We show that for certain words (such as entry-level nouns or hypernyms), we can indeed learn better referential word meanings by taking into account their semantic similarity to other words. For other words, there is no or even a detrimental effect, compared to a learning setup that presents even semantically related objects as negative instances.

Towards efficient human–machine collaboration: effects of gaze-driven feedback and engagement on performance

et al. 2018

Referential success is crucial for collaborative task-solving in shared environments. In face-to-face interactions, humans, therefore, exploit speech, gesture, and gaze to identify a specific object. We investigate if and how the gaze behavior of a human interaction partner can be used by a gaze-aware assistance system to improve referential success. Specifically, our system describes objects in the real world to a human listener using on-the-fly speech generation. It continuously interprets listener gaze and implements alternative strategies to react to this implicit feedback. We used this system to investigate an optimal strategy for task performance: providing an unambiguous, longer instruction right from the beginning, or starting with a shorter, yet ambiguous instruction. Further, the system provides gaze-driven feedback, which could be either underspecified (“No, not that one!”) or contrastive (“Further left!”). As expected, our results show that ambiguous instructions followed by underspecified feedback are not beneficial for task performance, whereas contrastive feedback results in faster interactions. Interestingly, this approach even outperforms unambiguous instructions (manipulation between subjects). However, when the system alternates between underspecified and contrastive feedback to initially ambiguous descriptions in an interleaved manner (within subjects), task performance is similar for both approaches. This suggests that listeners engage more intensely with the system when they can expect it to be cooperative. This, rather than the actual informativity of the spoken feedback, may determine the efficiency of information uptake and performance.
