Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 1 2017
DOI: 10.18653/v1/e17-1010
Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison

Abstract: Word Sense Disambiguation is a longstanding task in Natural Language Processing, lying at the core of human language understanding. However, the evaluation of automatic systems has been problematic, mainly due to the lack of a reliable evaluation framework. In this paper we develop a unified evaluation framework and analyze the performance of various Word Sense Disambiguation systems in a fair setup. The results show that supervised systems clearly outperform knowledge-based models. Among the supervised system…


Cited by 249 publications (317 citation statements)
References 34 publications
“…It is worth noting that RNN-based architectures outperformed classical supervised approaches (Zhong and Ng, 2010; Iacobacci et al., 2016) when dealing with verbs, which are shown to be highly ambiguous (Raganato et al., 2017). The performance on coarse-grained WSD followed the same trend (Table 2).…”
Section: Results
confidence: 64%
“…Architecture Details. To level the playing field with comparison systems on English all-words WSD, we followed Raganato et al. (2017) and, for all our models, we used a layer of word embeddings pre-trained on the English ukWaC corpus (Baroni et al., 2009) as initialization, and kept them fixed during the training process. For all architectures we then employed 2 layers of bidirectional LSTM with 2048 hidden units (1024 units per direction).…”
Section: Methods
confidence: 99%
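The architecture quoted above — a frozen pre-trained embedding layer feeding 2 bidirectional LSTM layers with 1024 units per direction, with a per-token sense classifier on top — can be sketched as follows. This is a minimal, hypothetical PyTorch rendering with toy dimensions for the vocabulary and sense inventory, not the cited authors' actual implementation; only the LSTM depth, directionality, hidden size, and frozen-embedding detail come from the quoted description.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Sketch of an all-words WSD tagger: frozen embeddings ->
    2-layer BiLSTM (1024 units per direction) -> per-token sense scores."""

    def __init__(self, vocab_size, emb_dim, n_senses, pretrained=None):
        super().__init__()
        # Embedding layer, optionally initialized from pre-trained vectors
        # and kept fixed during training ("kept them fixed" in the quote).
        self.emb = nn.Embedding(vocab_size, emb_dim)
        if pretrained is not None:
            self.emb.weight.data.copy_(pretrained)
        self.emb.weight.requires_grad = False
        # 2 bidirectional LSTM layers; 1024 hidden units per direction,
        # i.e. 2048-dimensional output per token.
        self.lstm = nn.LSTM(emb_dim, 1024, num_layers=2,
                            bidirectional=True, batch_first=True)
        # Linear layer mapping each token state to sense scores.
        self.out = nn.Linear(2 * 1024, n_senses)

    def forward(self, token_ids):
        h, _ = self.lstm(self.emb(token_ids))  # (batch, seq, 2048)
        return self.out(h)                     # (batch, seq, n_senses)

# Toy usage: batch of 1 sentence, 7 tokens, 10 candidate senses.
model = BiLSTMTagger(vocab_size=100, emb_dim=50, n_senses=10)
scores = model(torch.randint(0, 100, (1, 7)))
print(tuple(scores.shape))  # (1, 7, 10)
```

In a real setup the embedding matrix would be loaded from vectors trained on ukWaC and the classifier would score only the senses attested for each lemma, but those details are outside the quoted passage.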
“…Gold standard datasets. We performed our evaluations using the framework made available by Raganato et al. (2017a) on five different all-words datasets, namely: Senseval-2 (Edmonds and Cotton, 2001), Senseval-3 (Snyder and Palmer, 2004), SemEval-2007 (Pradhan et al., 2007), SemEval-2013 (Navigli et al., 2013) and SemEval-2015 (Moro and Navigli, 2015). We focused on nouns only, given that Wikipedia provides connections between nominal synsets only, and therefore contributes mainly to syntagmatic relations between nouns.…”
Section: Semantic Network
confidence: 99%
“…Given that scaling the manual annotation process becomes practically unfeasible when both lexicographic and encyclopedic knowledge is addressed (Schubert, 2006), recent years have witnessed efforts to produce larger sense-annotated corpora automatically (Moro et al., 2014a; Taghipour and Ng, 2015a; Scozzafava et al., 2015; Raganato et al., 2016). Even though these automatic approaches produce noisier corpora, it has been shown that training on them leads to better supervised and semi-supervised models (Taghipour and Ng, 2015b; Raganato et al., 2016; Yuan et al., 2016; Raganato et al., 2017), as well as to effective embedded representations for senses (Iacobacci et al., 2015; Flekova and Gurevych, 2016).…”
Section: Introduction
confidence: 99%