Proceedings of the 9th International Natural Language Generation Conference 2016
DOI: 10.18653/v1/w16-6642
Towards Generating Colour Terms for Referents in Photographs: Prefer the Expected or the Unexpected?

Abstract: Colour terms have been a prime phenomenon for studying language grounding, though previous work focussed mostly on descriptions of simple objects or colour swatches. This paper investigates whether colour terms can be learned from more realistic and potentially noisy visual inputs, using a corpus of referring expressions to objects represented as regions in real-world images. We obtain promising results from combining a classifier that grounds colour terms in visual input with a recalibration model that adjust…
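The two-stage setup the abstract describes — a classifier that grounds colour terms in visual input, followed by a recalibration step that adjusts its scores — can be sketched roughly as follows. The prototype-distance classifier, the linear blending rule, and all names below are illustrative assumptions, not the paper's actual model.

```python
# Hypothetical sketch of the two-stage idea in the abstract:
# (1) ground colour terms in visual features of an image region,
# (2) recalibrate the raw visual scores.
# The prototype classifier and the blending rule are illustrative
# assumptions, not the model described in the paper.

PROTOTYPES = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}

def colour_scores(mean_rgb):
    """Toy grounding classifier: score each colour term by how close
    the region's mean RGB value is to a prototype colour."""
    scores = {}
    for term, proto in PROTOTYPES.items():
        dist = sum((a - b) ** 2 for a, b in zip(mean_rgb, proto)) ** 0.5
        scores[term] = 1.0 / (1.0 + dist / 255.0)
    return scores

def recalibrate(scores, prior, weight=0.7):
    """Toy recalibration: blend visual scores with corpus-frequency
    priors over colour terms (illustrative only)."""
    return {t: weight * s + (1 - weight) * prior.get(t, 0.0)
            for t, s in scores.items()}

prior = {"red": 0.5, "green": 0.3, "blue": 0.2}
raw = colour_scores((200, 40, 40))   # a reddish image region
adjusted = recalibrate(raw, prior)
best = max(adjusted, key=adjusted.get)
```

For the reddish region above, both stages agree and the recalibrated scores still rank "red" first; the interesting cases in the paper are noisy regions where recalibration changes the ranking.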

Cited by 7 publications (11 citation statements)
References 16 publications
“…Similarly, distributional similarities between colors seem to be misleading rather than helpful, cf. (Zarrieß and Schlangen, 2016b) for a study on color adjectives on the same corpus. This effect seems to be related to findings on antonyms in distributional modeling (Nguyen et al, 2016).…”
Section: Results
confidence: 99%
“…The dialogue system's task is to generate expressions referring to objects in real-world images, intending to identify these objects to a human listener. The system's underlying generation component predicts words directly from low-level visual input representations of the target object defined via a bounding box in the image, based on the approach in [6,7]. As illustrated in Figure 1, this can lead to partially defective utterances being generated, due to imperfect visual language grounding [7].…”
Section: Introduction
confidence: 99%
“…Thus, it has been argued that dialogue systems interacting with users in real-world environments need principled communicative mechanisms for dealing with uncertainties, perceptual mismatches, and potential misunderstandings [14,15,16,17]. In this study, we look at (potentially less disturbing) defective color terms, which turn out to be hard to predict for objects in real-world images as well [6]. Thus, for compiling the materials of our experiment, we used color terms predicted for objects in images by [6]'s model which we identified as defective based on annotated color terms in the training set of the model.…”
Section: Introduction
confidence: 99%
“…While there have been previous approaches to generating referring expressions (REs) under uncertainty, those algorithms have been explicitly designed to refer to objects in visual scenes, and as such are tightly integrated with visual classifiers (Zarrieß and Schlangen, 2016; Roy, 2002; Meo et al, 2014). This is problematic for at least two reasons: First, intelligent agents may need to generate REs for a much wider class of entities than those appearing in a visual scene (e.g., agents, locations, ideas, utterances), which may not be possible if an REG algorithm is tightly coupled with visual classifiers.…”
Section: Introduction
confidence: 99%