Distributional methods have proven to excel at capturing fuzzy, graded aspects of meaning (Italy is more similar to Spain than to Germany). In contrast, it is difficult to extract the values of more specific attributes of word referents from distributional representations, attributes of the kind typically found in structured knowledge bases (Italy has 60 million inhabitants). In this paper, we pursue the hypothesis that distributional vectors also implicitly encode referential attributes. We show that a standard supervised regression model is in fact sufficient to retrieve such attributes to a reasonable degree of accuracy: When evaluated on the prediction of both categorical and numeric attributes of countries and cities, the model consistently reduces baseline error by 30%, and is not far from the upper bound. Further analysis suggests that our model is able to "objectify" distributional representations for entities, anchoring them more firmly in the external world in measurable ways.
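The abstract's "standard supervised regression model" mapping distributional vectors to numeric attribute values can be illustrated with a minimal ridge-regression sketch. Everything below is invented toy data, not the paper's actual embeddings or attributes; it only shows the shape of the approach (entity vector in, numeric attribute value out).

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy stand-in data: rows are hypothetical 4-d "country embeddings",
# y is a numeric attribute such as log population (values invented).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 4))
true_w = np.array([2.0, -1.0, 0.5, 3.0])
y_train = X_train @ true_w + rng.normal(scale=0.1, size=50)

w = fit_ridge(X_train, y_train, lam=0.1)

x_new = rng.normal(size=4)        # embedding of an unseen entity
pred = x_new @ w                  # predicted attribute value
```

The same template handles categorical attributes by swapping the regressor for a classifier; the point is only that a linear model over the entity vector suffices as a baseline.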
Instances ("Mozart") are ontologically distinct from concepts or classes ("composer"). Natural language encompasses both, but instances have received comparatively little attention in distributional semantics. Our results show that instances and concepts differ in their distributional properties. We also establish that instantiation detection ("Mozart – composer") is generally easier than hypernymy detection ("chemist – scientist"), and that results on the influence of input representation do not transfer from hypernymy to instantiation.
Word embeddings are supposed to provide easy access to semantic relations such as "male of" (man-woman). While this claim has been investigated for concepts, little is known about the distributional behavior of relations involving (named) entities. We describe two word embedding-based models that predict values for relational attributes of entities, and analyse them. The task is challenging, with major performance differences between relations. In contrast to many NLP tasks, high difficulty for a relation does not result from low frequency, but from (a) one-to-many mappings; and (b) a lack of context patterns expressing the relation that word embeddings can easily pick up.
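One common way to frame "predict values for relational attributes of entities" with embeddings is to learn a linear map from entity vectors to value vectors and decode by nearest neighbor among candidate values. The sketch below uses invented toy vectors and a made-up linear relation; it is an assumption about the general setup, not a reproduction of the paper's two models.

```python
import numpy as np

def fit_linear_map(E, V, lam=1e-2):
    """Regularized least squares: find M with M @ e ≈ v for
    (entity vector, attribute-value vector) training pairs."""
    d = E.shape[1]
    Mt = np.linalg.solve(E.T @ E + lam * np.eye(d), E.T @ V)
    return Mt.T

def predict_value(M, e, candidates):
    """Index of the candidate vector closest (by cosine) to M @ e."""
    p = M @ e
    sims = candidates @ p / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(p) + 1e-9)
    return int(np.argmax(sims))

# Invented toy data: entity and value vectors related by a fixed linear
# transform, loosely mimicking a relation like country → capital.
rng = np.random.default_rng(1)
true_M = rng.normal(size=(4, 4))
E = rng.normal(size=(30, 4))
V = E @ true_M.T
M = fit_linear_map(E, V)

e_test = rng.normal(size=4)
cands = np.vstack([e_test @ true_M.T, rng.normal(size=(5, 4))])
best = predict_value(M, e_test, cands)   # index of the predicted value
```

A one-to-many relation breaks this decoding step directly: several candidates are equally "correct", so the single nearest neighbor is often wrong, which matches difficulty source (a) in the abstract.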
The Practical Lexical Function model (PLF) is a recently proposed compositional distributional semantic model which provides an elegant account of composition, striking a balance between expressiveness and robustness and performing at the state of the art. In this paper, we identify an inconsistency in PLF between the objective function at training and the prediction at testing, which leads to an overcounting of the predicate's contribution to the meaning of the phrase. We investigate two possible solutions, one of which (excluding the simple lexical vector at test time) improves performance significantly on two of the three composition datasets.
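The proposed fix can be made concrete with a schematic composition function. This is a deliberately simplified sketch, not the exact PLF formulation: it only shows the structural point that the phrase vector can be built with or without the predicate's own lexical vector, the term whose exclusion at test time is the solution the abstract finds effective.

```python
import numpy as np

def compose(pred_vec, pred_mat, arg_vec, include_lexical=True):
    """Schematic predicate-argument composition: combine the argument
    vector with the predicate's matrix applied to the argument, and
    optionally add the predicate's own lexical vector. Dropping that
    vector at test time avoids overcounting the predicate."""
    phrase = arg_vec + pred_mat @ arg_vec
    if include_lexical:
        phrase = phrase + pred_vec
    return phrase

# Invented toy parameters for one predicate and one argument.
pred_vec = np.array([1.0, 0.0, 2.0])
pred_mat = np.eye(3) * 0.5
arg_vec = np.array([0.5, 1.0, -1.0])

with_lex = compose(pred_vec, pred_mat, arg_vec, include_lexical=True)
without_lex = compose(pred_vec, pred_mat, arg_vec, include_lexical=False)
```

The two outputs differ by exactly `pred_vec`, which is the "overcounted" contribution: it already shaped the training objective, so adding it again at prediction time double-counts the predicate.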