Aurélie Herbelot scite author profile

In this paper, we aim to understand whether current language and vision (LaVi) models truly grasp the interaction between the two modalities. To this end, we propose an extension of the MS-COCO dataset, FOIL-COCO, which associates images with both correct and 'foil' captions, that is, descriptions of the image that are highly similar to the original ones, but contain one single mistake ('foil word'). We show that current LaVi models fall into the traps of this data and perform badly on three tasks: a) caption classification (correct vs. foil); b) foil word detection; c) foil word correction. Humans, in contrast, have near-perfect performance on those tasks. We demonstrate that merely utilising language cues is not enough to model FOIL-COCO and that it challenges the state-of-the-art by requiring a fine-grained understanding of the relation between text and image.

show abstract

High-risk learning: acquiring new word vectors from tiny data

Herbelot¹,

Baroni²

2017

106

View full text Add to dashboard Cite

Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn 'a good vector' for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences' worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.

show abstract

Building a shared world: mapping distributional to model-theoretic semantic spaces

Herbelot

Vecchi

2015

View full text Add to dashboard Cite

In this paper, we introduce an approach to automatically map a standard distributional semantic space onto a set-theoretic model. We predict that there is a functional relationship between distributional information and vectorial concept representations in which dimensions are predicates and weights are generalised quantifiers. In order to test our prediction, we learn a model of such relationship over a publicly available dataset of feature norms annotated with natural language quantifiers. Our initial experimental results show that, at least for domain-specific data, we can indeed map between formalisms, and generate high-quality vector representations which encapsulate set overlap information. We further investigate the generation of natural language quantifiers from such vectors.

show abstract

Formal Distributional Semantics: Introduction to the Special Issue

Boleda

Herbelot

2016

Computational Linguistics

View full text Add to dashboard Cite

Formal Semantics and Distributional Semantics are two very influential semantic frameworks in Computational Linguistics. Formal Semantics is based on a symbolic tradition and centered around the inferential properties of language. Distributional Semantics is statistical and data-driven, and focuses on aspects of meaning related to descriptive content. The two frameworks are complementary in their strengths, and this has motivated interest in combining them into an overarching semantic framework: a “Formal Distributional Semantics.” Given the fundamentally different natures of the two paradigms, however, building an integrative framework poses significant theoretical and engineering challenges. The present issue of Computational Linguistics advances the state of the art in Formal Distributional Semantics; this introductory article explains the motivation behind it and summarizes the contributions of previous work on the topic, providing the necessary background for the articles that follow.

show abstract

“Look, some Green Circles!”: Learning to Quantify from Images

Sorodoc¹,

Lazaridou²,

Boleda³

et al. 2016

View full text Add to dashboard Cite

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.