Thomas Kober scite author profile

A critique of word similarity as a method for evaluating distributional semantic models

This paper aims to re-think the role of the word similarity task in distributional semantics research. We argue that while it is a valuable tool, it should be used with care because it provides only an approximate measure of the quality of a distributional model. Word similarity evaluations assume there exists a single notion of similarity that is independent of a particular application. Further, the small size and low inter-annotator agreement of existing data sets make it challenging to find significant differences between models.

Aligning Packed Dependency Trees: A Theory of Composition for Distributional Semantics

Weir

Weeds

Reffin

et al. 2016

Computational Linguistics

We present a new framework for compositional distributional semantics in which the distributional contexts of lexemes are expressed in terms of anchored packed dependency trees. We show that these structures have the potential to capture the full sentential contexts of a lexeme and provide a uniform basis for the composition of distributional knowledge in a way that captures both mutual disambiguation and generalization.

Data Augmentation for Hypernymy Detection

Kober

Weeds

Bertolini

et al. 2021

The automatic detection of hypernymy relationships represents a challenging problem in NLP. The successful application of state-of-the-art supervised approaches using distributed representations has generally been impeded by the limited availability of high-quality training data. We have developed two novel data augmentation techniques which generate new training examples from existing ones. First, we combine the linguistic principles of hypernym transitivity and intersective modifier-noun composition to generate additional pairs of vectors, such as small dog - dog or small dog - animal, for which a hypernymy relationship can be assumed. Second, we use generative adversarial networks (GANs) to generate pairs of vectors for which the hypernymy relation can also be assumed. We furthermore present two complementary strategies for extending an existing dataset by leveraging linguistic resources such as WordNet. Using an evaluation across 3 different datasets for hypernymy detection and 2 different vector spaces, we demonstrate that both of the proposed automatic data augmentation and dataset extension strategies substantially improve classifier performance.
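
The first augmentation idea in this abstract can be illustrated at the level of word and phrase labels. The following is a minimal sketch only: the function name `augment` and the toy inputs are hypothetical, and the paper itself works with pairs of vectors, which would additionally require a composition function for the phrase that is not shown here.

```python
def augment(hypernym_pairs, modifiers):
    """Generate extra (hyponym, hypernym) label pairs from existing ones.

    hypernym_pairs: iterable of (hyponym, hypernym) strings, e.g. ("dog", "animal")
    modifiers: iterable of intersective modifiers, e.g. "small"
    Both inputs are illustrative assumptions, not data from the paper.
    """
    new_pairs = set()
    for hyponym, hypernym in hypernym_pairs:
        for modifier in modifiers:
            phrase = f"{modifier} {hyponym}"
            # Intersective composition: a "small dog" is still a "dog".
            new_pairs.add((phrase, hyponym))
            # Transitivity: "small dog" -> "dog" -> "animal".
            new_pairs.add((phrase, hypernym))
    return sorted(new_pairs)


print(augment([("dog", "animal")], ["small"]))
# prints [('small dog', 'animal'), ('small dog', 'dog')]
```

Each generated pair would then be mapped to vectors before being added to the training set.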

Improving Sparse Word Representations with Distributional Inference for Semantic Composition

Kober

Weeds

Reffin

et al. 2016

Distributional models are derived from co-occurrences in a corpus, where only a small proportion of all possible plausible co-occurrences will be observed. This results in a very sparse vector space, requiring a mechanism for inferring missing knowledge. Most methods face this challenge in ways that render the resulting word representations uninterpretable, with the consequence that semantic composition becomes hard to model. In this paper we explore an alternative which involves explicitly inferring unobserved co-occurrences using the distributional neighbourhood. We show that distributional inference improves sparse word representations on several word similarity benchmarks and demonstrate that our model is competitive with the state-of-the-art for adjective-noun, noun-noun and verb-object compositions while being fully interpretable.
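
One way to picture inferring unobserved co-occurrences from the distributional neighbourhood is sketched below. This is an assumption-laden toy, not the paper's algorithm: the names `cosine` and `infer`, the choice of cosine similarity, the unweighted summing of neighbour counts, and the three-word space are all invented for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[feature] for feature, weight in u.items() if feature in v)
    norm_u = sqrt(sum(weight * weight for weight in u.values()))
    norm_v = sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def infer(word, space, k=1):
    """Enrich a word's sparse vector with the counts of its k nearest neighbours."""
    target = space[word]
    candidates = [other for other in space if other != word]
    neighbours = sorted(
        candidates, key=lambda other: cosine(target, space[other]), reverse=True
    )[:k]
    enriched = Counter(target)
    for neighbour in neighbours:
        # Counter.update adds counts, so unseen features become non-zero.
        enriched.update(space[neighbour])
    return dict(enriched), neighbours


# Toy co-occurrence counts (hypothetical).
space = {
    "dog": {"bark": 3, "pet": 2},
    "puppy": {"bark": 1, "pet": 2, "cute": 4},
    "car": {"drive": 5, "engine": 3},
}

enriched, neighbours = infer("dog", space, k=1)
print(neighbours)  # prints ['puppy']
print(enriched)    # prints {'bark': 4, 'pet': 4, 'cute': 4}
```

Because the enriched vector still uses observable co-occurrence features as dimensions, it stays interpretable: "dog" now has a non-zero count for "cute" that was never observed with it directly.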

One Representation per Word - Does it make Sense for Composition?

Kober

Weeds

Wilkie

et al. 2017

In this paper, we investigate whether an a priori disambiguation of word senses is strictly necessary or whether the meaning of a word in context can be disambiguated through composition alone. We evaluate the performance of off-the-shelf single-vector and multi-sense vector models on a benchmark phrase similarity task and a novel task for word-sense discrimination. We find that single-sense vector models perform as well as or better than multi-sense vector models despite arguably less clean elementary representations. Our findings furthermore show that simple composition functions such as pointwise addition are able to recover sense-specific information from a single-sense vector model remarkably well.
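
The intuition behind recovering sense-specific information through pointwise addition can be shown with a toy example. The sketch below uses invented two-dimensional vectors (dimensions loosely standing for "finance" and "nature") and hypothetical helper names; it is not the paper's experimental setup.

```python
from math import sqrt


def add(u, v):
    """Pointwise addition of two dense vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical single-sense vectors: [finance, nature].
bank = [1.0, 1.0]   # ambiguous: mixes both senses
river = [0.0, 1.0]
shore = [0.1, 1.0]  # probe word for the "nature" sense

river_bank = add(bank, river)

# Composing "bank" with "river" pulls the phrase towards the nature sense.
print(cosine(river_bank, shore) > cosine(bank, shore))  # prints True
```

In this toy setting the context word reweights the dimensions of the ambiguous vector, which is the kind of disambiguation-by-composition effect the abstract describes.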
