Sina Zarrieß scite author profile

Research on generating referring expressions has so far mostly focussed on "one-shot reference", where the aim is to generate a single, discriminating expression. In interactive settings, however, it is not uncommon for reference to be established in "installments", where referring information is offered piecewise until success has been confirmed. We show that this strategy can also be advantageous in technical systems that only have uncertain access to object attributes and categories. We train a recently introduced model of grounded word meaning on a data set of REs for objects in images and learn to predict semantically appropriate expressions. In a human evaluation, we observe that users are sensitive to inadequate object names, which unfortunately are not unlikely to be generated from low-level visual input. We propose a solution inspired by human task-oriented interaction and implement strategies for avoiding and repairing semantically inaccurate words. We enhance a word-based REG with context-aware, referential installments and find that they substantially improve the referential success of the system.

Tell Me More: A Dataset of Visual Scene Description Sequences

Ilinykh, Zarrieß, Schlangen (2019)

We present a dataset consisting of what we call image description sequences. These multi-sentence descriptions of the contents of an image were collected in a pseudo-interactive setting, where the describer was told to describe the given image to a listener who needs to identify the image within a set of images, and who successively asks for more information. As we show, this setup produced nicely structured data that, we think, will be useful for learning models capable of planning and realising such description discourses.

Exploiting translational correspondences for pattern-independent MWE identification

Zarrieß, Kuhn (2009)

Based on a study of verb translations in the Europarl corpus, we argue that a wide range of MWE patterns can be identified in translations that exhibit a correspondence between a single lexical item in the source language and a group of lexical items in the target language. We show that these correspondences can be reliably detected on dependency-parsed, word-aligned sentences. We propose an extraction method that combines word alignment with syntactic filters and is independent of the structural pattern of the translation.

Towards Generating Colour Terms for Referents in Photographs: Prefer the Expected or the Unexpected?

Zarrieß, Schlangen (2016)

Colour terms have been a prime phenomenon for studying language grounding, though previous work focussed mostly on descriptions of simple objects or colour swatches. This paper investigates whether colour terms can be learned from more realistic and potentially noisy visual inputs, using a corpus of referring expressions to objects represented as regions in real-world images. We obtain promising results from combining a classifier that grounds colour terms in visual input with a recalibration model that adjusts probability distributions over colour terms according to contextual and object-specific preferences.

Know What You Don’t Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories

Zarrieß, Schlangen (2019)

Zero-shot learning in Language & Vision is the task of correctly labelling (or naming) objects of novel categories. Another strand of work in L&V aims at pragmatically informative rather than "correct" object descriptions, e.g. in reference games. We combine these lines of research and model zero-shot reference games, where a speaker needs to successfully refer to a novel object in an image. Inspired by models of "rational speech acts", we extend a neural generator to become a pragmatic speaker reasoning about uncertain object categories. As a result of this reasoning, the generator produces fewer nouns and names of distractor categories as compared to a literal speaker. We show that this conversational strategy for dealing with novel objects often improves communicative success, in terms of resolution accuracy of an automatic listener.
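
The "rational speech acts" reasoning mentioned in this abstract can be illustrated with a toy sketch. This is not the paper's neural model: the objects, utterances, lexicon scores, and the `alpha` parameter below are all hypothetical values chosen for illustration, and the uniform object prior is an assumption. The sketch only shows the generic idea that a pragmatic speaker scores an utterance by how well a literal listener would resolve it to the target.

```python
# Toy rational-speech-acts (RSA) speaker. All data below is hypothetical.
OBJECTS = ["dog", "novel_animal"]

# LEXICON[utterance][object]: how well the utterance applies to the object.
# The name "dog" applies only weakly to the novel object (uncertain category).
LEXICON = {
    "dog": {"dog": 1.0, "novel_animal": 0.4},
    "animal": {"dog": 1.0, "novel_animal": 1.0},
    "brown thing": {"dog": 0.0, "novel_animal": 1.0},
}


def literal_listener(utterance, lexicon, objects):
    """P(object | utterance), assuming a uniform prior over objects."""
    scores = {o: lexicon[utterance][o] for o in objects}
    total = sum(scores.values())
    return {o: (s / total if total > 0 else 0.0) for o, s in scores.items()}


def literal_speaker(target, lexicon):
    """P(utterance | target), proportional to how well the utterance fits."""
    scores = {u: lexicon[u][target] for u in lexicon}
    total = sum(scores.values())
    return {u: s / total for u, s in scores.items()}


def pragmatic_speaker(target, lexicon, objects, alpha=1.0):
    """P(utterance | target), proportional to literal_listener(target | u) ** alpha."""
    scores = {
        u: literal_listener(u, lexicon, objects)[target] ** alpha
        for u in lexicon
    }
    total = sum(scores.values())
    return {u: s / total for u, s in scores.items()}


literal = literal_speaker("novel_animal", LEXICON)
pragmatic = pragmatic_speaker("novel_animal", LEXICON, OBJECTS)
```

In this toy setting the pragmatic speaker shifts probability away from the distractor's name ("dog") and towards the attribute-based description, because the literal listener resolves that description unambiguously.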

The Why and The How: A Survey on Natural Language Interaction in Visualization

Voigt, Alaçam, Meuschke et al. (2022)

Natural language as a modality of interaction is becoming increasingly popular in the field of visualization. In addition to the popular query interfaces, other language-based interactions such as annotations, recommendations, explanations, or documentation experience growing interest. In this survey, we provide an overview of natural language-based interaction in the research area of visualization. We discuss a renowned taxonomy of visualization tasks and classify 119 related works to illustrate the state-of-the-art of how current natural language interfaces support their performance. We examine applied NLP methods and discuss human-machine dialogue structures with a focus on initiative, duration, and communicative functions in recent visualization-oriented dialogue interfaces. Based on this overview, we point out interesting areas for the future application of NLP methods in the field of visualization.

Decoding Strategies for Neural Referring Expression Generation

Zarrieß, Schlangen (2018)

RNN-based sequence generation is now widely used in NLP and NLG (natural language generation). Most work focusses on how to train RNNs, even though decoding is not necessarily straightforward either: previous work on neural MT found seq2seq models to radically prefer short candidates, and has proposed a number of beam search heuristics to deal with this. In this work, we assess decoding strategies for referring expression generation with neural models. Here, expression length is crucial: output should contain neither too much nor too little information, in order to be pragmatically adequate. We find that most beam search heuristics developed for MT do not generalize well to referring expression generation (REG), and do not generally outperform greedy decoding. We observe that beam search heuristics for termination seem to override the model's knowledge of what a good stopping point is. Therefore, we also explore a recent approach called trainable decoding, which uses a small network to modify the RNN's hidden state for better decoding results. We find this approach to consistently outperform greedy decoding for REG.
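
To make the decoding terms in this abstract concrete, the following is a minimal sketch of greedy decoding and beam search with optional length normalisation (one common beam search heuristic). It is not the paper's setup: the next-token table `TOY_MODEL`, the function names, and the choice to normalise by the number of tokens including the end symbol are assumptions made for illustration, and the sketch does not reproduce the paper's findings.

```python
import math

EOS = "</s>"

# Hypothetical next-token probabilities, standing in for a trained RNN.
TOY_MODEL = {
    (): {"chair": 0.6, "red": 0.4},
    ("chair",): {EOS: 0.7, "left": 0.3},
    ("chair", "left"): {EOS: 1.0},
    ("red",): {"chair": 1.0},
    ("red", "chair"): {"left": 0.9, EOS: 0.1},
    ("red", "chair", "left"): {EOS: 1.0},
}


def toy_step(prefix):
    """Return log-probabilities of the next token given a prefix tuple."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[prefix].items()}


def greedy_decode(step_fn, max_len=10):
    """Pick the single most probable token at every step."""
    prefix, score = (), 0.0
    for _ in range(max_len):
        dist = step_fn(prefix)
        token = max(dist, key=dist.get)
        score += dist[token]
        if token == EOS:
            break
        prefix += (token,)
    return prefix, score


def beam_search(step_fn, beam_size=2, max_len=10, length_norm=False):
    """Keep the beam_size best partial hypotheses at every step."""
    beams, finished = [((), 0.0)], []
    for _ in range(max_len):
        candidates = [
            (prefix + (tok,), score + lp)
            for prefix, score in beams
            for tok, lp in step_fn(prefix).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for hyp, score in candidates[:beam_size]:
            if hyp[-1] == EOS:
                finished.append((hyp[:-1], score))  # store without EOS
            else:
                beams.append((hyp, score))
        if not beams:
            break
    if not finished:
        finished = beams  # nothing terminated within max_len

    def rank(item):
        tokens, score = item
        # Length normalisation: average log-probability per token (incl. EOS).
        return score / (len(tokens) + 1) if length_norm else score

    return max(finished, key=rank)
```

With this toy table, greedy decoding and unnormalised beam search both return the short expression `("chair",)`, while length-normalised beam search returns `("red", "chair", "left")`, which shows how a length heuristic can change the stopping point that the raw model scores would select.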

Sina Zarrieß

Resolving References to Objects in Photographs using the Words-As-Classifiers Model

Easy Things First: Installments Improve Referring Expression Generation for Objects in Photographs
