Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics - 1999
DOI: 10.3115/1034678.1034693

Measures of distributional similarity

Abstract: We study distributional similarity measures for the purpose of improving probability estimation for unseen cooccurrences. Our contributions are three-fold: an empirical comparison of a broad range of measures; a classification of similarity functions based on the information that they incorporate; and the introduction of a novel function that is superior at evaluating potential proxy distributions.

Cited by 381 publications (308 citation statements)
References 31 publications
“…The problem arises when the probability of word combinations that do not occur in the training data needs to be estimated. The smoothing methods proposed in the literature (overviews are provided by Dagan et al (1999) and Lee (1999)) can be generally divided into three types: discounting (Katz, 1987), class-based smoothing (Resnik, 1993; Brown et al, 1992; Pereira et al, 1993), and distance-weighted averaging (Grishman and Sterling, 1994; Dagan et al, 1999).…”
Section: Smoothing Methods (mentioning)
confidence: 99%
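
As a rough illustration of the distance-weighted averaging idea named in the statement above, the following minimal sketch estimates a conditional distribution for a word as a similarity-weighted average of the distributions of its neighbours. The function name `weighted_average_estimate`, the toy words, and the weights are hypothetical and chosen for illustration only; they are not taken from any of the cited papers.

```python
from typing import Dict

Dist = Dict[str, float]


def weighted_average_estimate(neighbour_dists: Dict[str, Dist],
                              weights: Dict[str, float]) -> Dist:
    """Average the neighbours' conditional distributions P(y | x'),
    weighting each neighbour x' by its similarity weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("similarity weights must sum to a positive value")
    estimate: Dist = {}
    for neighbour, weight in weights.items():
        for y, prob in neighbour_dists[neighbour].items():
            # Normalise the weight so the result is still a distribution.
            estimate[y] = estimate.get(y, 0.0) + (weight / total) * prob
    return estimate


# Toy data (assumed): two neighbours of an unseen word, with made-up weights.
neighbour_dists = {
    "apple": {"eat": 0.75, "buy": 0.25},
    "pear": {"eat": 0.5, "buy": 0.5},
}
weights = {"apple": 3.0, "pear": 1.0}
estimate = weighted_average_estimate(neighbour_dists, weights)
```

In this sketch the weights would normally come from a distributional similarity function; any such function could be plugged in, which is why the choice of measure matters.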
“…A key feature of this type of smoothing is the function which measures distributional similarity from cooccurrence frequencies. Several measures of distributional similarity have been proposed in the literature (Dagan et al, 1999; Lee, 1999). We used two measures, the Jensen-Shannon divergence and the confusion probability.…”
Section: Distance-weighted Averaging (mentioning)
confidence: 99%
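
For readers unfamiliar with the Jensen-Shannon divergence mentioned in the statement above, here is a minimal sketch of one common formulation: the mean of the two KL divergences from each distribution to their average. The base-2 logarithm and the factor of one half are assumptions of this sketch; papers differ on both conventions, and the helper names are illustrative.

```python
import math
from typing import Dict

Dist = Dict[str, float]


def kl_divergence(p: Dist, q: Dist) -> float:
    """D(p || q) in bits; assumes q[y] > 0 wherever p[y] > 0."""
    return sum(pv * math.log2(pv / q[y]) for y, pv in p.items() if pv > 0)


def jensen_shannon(p: Dist, q: Dist) -> float:
    """Mean of D(p || avg) and D(q || avg), where avg is the average of p and q."""
    keys = set(p) | set(q)
    avg = {y: 0.5 * (p.get(y, 0.0) + q.get(y, 0.0)) for y in keys}
    # avg is positive wherever p or q is, so both KL terms are finite.
    return 0.5 * kl_divergence(p, avg) + 0.5 * kl_divergence(q, avg)


# Toy distributions (assumed) over cooccurring words.
p = {"eat": 0.75, "buy": 0.25}
q = {"eat": 0.5, "buy": 0.5}
disjoint_a = {"eat": 1.0}
disjoint_b = {"buy": 1.0}
```

Unlike the plain KL divergence, this quantity stays finite when one distribution assigns zero probability to an event the other has seen, which makes it usable on sparse cooccurrence counts. Smaller values indicate more similar distributions.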
“…That can be useful in real WSD. Others who have worked on variations of PWSD include Gale et al (1992); Schütze (1998); Lee (1999); Dagan et al (1999); Rooth et al (1999); Clark and Weir (2002); Weeds and Weir (2005); Zhitomirsky-Geffet and Dagan (2009). The methodology we followed was similar to that of Weeds and Weir.…”
Section: Pseudo-word-sense Disambiguation (mentioning)
confidence: 99%
“…The motivation behind this design is to cover all the types of OWL hierarchical set relations, such as subclass, union, and intersection. Furthermore, this pattern relies on theoretical foundations, including Jaccard similarity coefficient [27], similarity in semantic networks by Rada et al [28], feature-based similarity in description logic by Borgida et al [29], and general cognitive theories about similarity by Tversky [30].…”
Section: Set Hierarchy Pattern (mentioning)
confidence: 99%