Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1120

Neural Sequence Learning Models for Word Sense Disambiguation

Abstract: Word Sense Disambiguation models exist in many flavors. Even though supervised ones tend to perform best in terms of accuracy, they often lose ground to more flexible knowledge-based solutions, which do not require training by a word expert for every disambiguation target. To bridge this gap we adopt a different perspective and rely on sequence learning to frame the disambiguation problem: we propose and study in depth a series of end-to-end neural architectures directly tailored to the task, from bidirectional …
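
To make the framing concrete, here is a minimal sketch (in PyTorch, with assumed vocabulary sizes, dimensions, and tag inventory; it is not the authors' exact architecture) of WSD cast as sequence labelling: a bidirectional LSTM reads the whole sentence and emits one sense label per token.

import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Toy WSD-as-sequence-labelling model: one sense logit vector per token."""
    def __init__(self, vocab_size, sense_vocab_size, emb_dim=300, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, sense_vocab_size)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        states, _ = self.lstm(self.emb(token_ids))   # (batch, seq_len, 2*hidden)
        return self.out(states)                      # per-token sense logits

# Toy usage with assumed sizes: 10k-word vocabulary, 5k sense tags, a 3-token sentence.
model = BiLSTMTagger(vocab_size=10_000, sense_vocab_size=5_000)
logits = model(torch.tensor([[12, 845, 7]]))
print(logits.shape)  # torch.Size([1, 3, 5000])

A single tagger of this kind disambiguates all target words jointly, which is what lets the approach avoid training a separate word-expert classifier per lemma.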

Cited by 171 publications (182 citation statements)
References 48 publications
“…Although word expert supervised WSD methods perform better, they are less flexible than knowledge-based methods in the all-words WSD task (Raganato et al, 2017a). Recent neural-based methods are devoted to dealing with this problem.…”
Section: Traditional (mentioning)
Confidence: 99%
“…Luo et al (2018a) and Luo et al (2018b) combine information from glosses present in WordNet, with NLMs based on BiLSTMs, through memory networks and co-attention mechanisms, respectively. Vial et al (2018) follows Raganato et al (2017b)'s BiLSTM method, but leverages the semantic network to strategically reduce the set of senses required for disambiguating words.…”
Section: WSD State-of-the-art (mentioning)
Confidence: 99%
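
The sense-vocabulary reduction mentioned in the excerpt above can be illustrated with a toy WordNet example (an illustration of the general idea only, not Vial et al.'s actual compression algorithm): each fine-grained synset is mapped to an ancestor a few hypernym steps up, so several senses collapse into one coarser output label and the tagger's tag set shrinks.

from nltk.corpus import wordnet as wn   # requires NLTK's WordNet data

def coarse_sense(synset, levels_up=2):
    """Walk up the hypernym hierarchy; stop early if the chain runs out."""
    current = synset
    for _ in range(levels_up):
        hypernyms = current.hypernyms()
        if not hypernyms:
            break
        current = hypernyms[0]
    return current.name()

# Several noun senses of "bank" may end up sharing the same coarse label.
for s in wn.synsets("bank", pos=wn.NOUN)[:4]:
    print(s.name(), "->", coarse_sense(s))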
“…Apart from that, it can easily be seen that the Seq2Seq architectures perform very well against the statistical and knowledge-based methods like IMS + adapted CW [22], Htsa3 [23], UKBgloss w2w [27], Babelfy [28] as well as RST-kernels [24], achieving results that are superior or equivalent to the best models as mentioned above. One interesting evaluation is that the Seq2Seq baseline from [18] is 69.6% and our Seq2Seq baseline performance is 66.3%. When [18] added POS, where it is meant to be learned as one of their tasks, their performance degrades from 69.6% to 68.5%.…”
Section: Results (mentioning)
Confidence: 99%
“…One interesting evaluation is that the Seq2Seq baseline from [18] is 69.6% and our Seq2Seq baseline performance is 66.3%. When [18] added POS, where it is meant to be learned as one of their tasks, their performance degrades from 69.6% to 68.5%. When we added POS as a feature, our performance jumped from 66.3% to 67.5% which clearly shows that adding POS information as a feature has an influence on identifying polysemy of a word.…”
Section: Results (mentioning)
Confidence: 99%
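
The POS-as-feature setup described in the excerpt above is typically implemented by concatenating a small POS-tag embedding with the word embedding at the encoder input. The sketch below shows that wiring with assumed dimensions and tag-set size; it is not the cited system's implementation.

import torch
import torch.nn as nn

class WordPlusPOSEncoder(nn.Module):
    """Toy encoder that consumes word ids plus POS-tag ids as parallel input features."""
    def __init__(self, vocab_size, pos_size, word_dim=300, pos_dim=32, hidden=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim, padding_idx=0)
        self.pos_emb = nn.Embedding(pos_size, pos_dim, padding_idx=0)
        self.encoder = nn.LSTM(word_dim + pos_dim, hidden,
                               batch_first=True, bidirectional=True)

    def forward(self, word_ids, pos_ids):
        # Concatenate the two feature embeddings per token, then encode the sequence.
        x = torch.cat([self.word_emb(word_ids), self.pos_emb(pos_ids)], dim=-1)
        states, _ = self.encoder(x)
        return states  # (batch, seq_len, 2*hidden), fed to a decoder or sense classifier

# Toy usage with an assumed 10k-word vocabulary and 18 POS tags.
enc = WordPlusPOSEncoder(vocab_size=10_000, pos_size=18)
out = enc(torch.tensor([[4, 99, 7]]), torch.tensor([[1, 2, 3]]))
print(out.shape)  # torch.Size([1, 3, 512])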