Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1120

Neural Sequence Learning Models for Word Sense Disambiguation

Abstract: Word Sense Disambiguation models exist in many flavors. Even though supervised ones tend to perform best in terms of accuracy, they often lose ground to more flexible knowledge-based solutions, which do not require training by a word expert for every disambiguation target. To bridge this gap we adopt a different perspective and rely on sequence learning to frame the disambiguation problem: we propose and study in depth a series of end-to-end neural architectures directly tailored to the task, from bidirectional …
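
To make the framing concrete, here is a minimal sketch (in PyTorch, with assumed vocabulary sizes, dimensions, and tag inventory; it is not the authors' exact architecture) of WSD cast as sequence labelling: a bidirectional LSTM reads the whole sentence and emits one sense label per token.

import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Toy WSD-as-sequence-labelling model: one sense logit vector per token."""
    def __init__(self, vocab_size, sense_vocab_size, emb_dim=300, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, sense_vocab_size)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        states, _ = self.lstm(self.emb(token_ids))   # (batch, seq_len, 2*hidden)
        return self.out(states)                      # per-token sense logits

# Toy usage with assumed sizes: 10k-word vocabulary, 5k sense tags, a 3-token sentence.
model = BiLSTMTagger(vocab_size=10_000, sense_vocab_size=5_000)
logits = model(torch.tensor([[12, 845, 7]]))
print(logits.shape)  # torch.Size([1, 3, 5000])

A single tagger of this kind disambiguates all target words jointly, which is what lets the approach avoid training a separate word-expert classifier per lemma.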

Cited by 171 publications (182 citation statements)
References 48 publications
“…Although word expert supervised WSD methods perform better, they are less flexible than knowledge-based methods in the all-words WSD task (Raganato et al, 2017a). Recent neural-based methods are devoted to dealing with this problem.…”
Section: Traditional (mentioning)
Confidence: 99%
“…Luo et al (2018a) and Luo et al (2018b) combine information from glosses present in WordNet, with NLMs based on BiLSTMs, through memory networks and co-attention mechanisms, respectively. Vial et al (2018) follows Raganato et al (2017b)'s BiLSTM method, but leverages the semantic network to strategically reduce the set of senses required for disambiguating words.…”
Section: WSD State-of-the-art (mentioning)
Confidence: 99%
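
The sense-vocabulary reduction mentioned in the excerpt above can be illustrated with a toy WordNet example (an illustration of the general idea only, not Vial et al.'s actual compression algorithm): each fine-grained synset is mapped to an ancestor a few hypernym steps up, so several senses collapse into one coarser output label and the tagger's tag set shrinks.

from nltk.corpus import wordnet as wn   # requires NLTK's WordNet data

def coarse_sense(synset, levels_up=2):
    """Walk up the hypernym hierarchy; stop early if the chain runs out."""
    current = synset
    for _ in range(levels_up):
        hypernyms = current.hypernyms()
        if not hypernyms:
            break
        current = hypernyms[0]
    return current.name()

# Several noun senses of "bank" may end up sharing the same coarse label.
for s in wn.synsets("bank", pos=wn.NOUN)[:4]:
    print(s.name(), "->", coarse_sense(s))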
“…Apart from that, it can easily be seen that the Seq2Seq architectures perform very well against the statistical and knowledge-based methods like IMS + adapted CW [22], Htsa3 [23], UKBgloss w2w [27], Babelfy [28] as well as RST-kernels [24], achieving results that are superior or equivalent to the best models as mentioned above. One interesting evaluation is that the Seq2Seq baseline from [18] is 69.6% and our Seq2Seq baseline performance is 66.3%. When [18] added POS, where it is meant to be learned as one of their tasks, their performance degrades from 69.6% to 68.5%.…”
Section: Results (mentioning)
Confidence: 99%
“…One interesting evaluation is that the Seq2Seq baseline from [18] is 69.6% and our Seq2Seq baseline performance is 66.3%. When [18] added POS, where it is meant to be learned as one of their tasks, their performance degrades from 69.6% to 68.5%. When we added POS as a feature, our performance jumped from 66.3% to 67.5% which clearly shows that adding POS information as a feature has an influence on identifying polysemy of a word.…”
Section: Results (mentioning)
Confidence: 99%
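
The POS-as-feature setup described in the excerpt above is typically implemented by concatenating a small POS-tag embedding with the word embedding at the encoder input. The sketch below shows that wiring with assumed dimensions and tag-set size; it is not the cited system's implementation.

import torch
import torch.nn as nn

class WordPlusPOSEncoder(nn.Module):
    """Toy encoder that consumes word ids plus POS-tag ids as parallel input features."""
    def __init__(self, vocab_size, pos_size, word_dim=300, pos_dim=32, hidden=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim, padding_idx=0)
        self.pos_emb = nn.Embedding(pos_size, pos_dim, padding_idx=0)
        self.encoder = nn.LSTM(word_dim + pos_dim, hidden,
                               batch_first=True, bidirectional=True)

    def forward(self, word_ids, pos_ids):
        # Concatenate the two feature embeddings per token, then encode the sequence.
        x = torch.cat([self.word_emb(word_ids), self.pos_emb(pos_ids)], dim=-1)
        states, _ = self.encoder(x)
        return states  # (batch, seq_len, 2*hidden), fed to a decoder or sense classifier

# Toy usage with an assumed 10k-word vocabulary and 18 POS tags.
enc = WordPlusPOSEncoder(vocab_size=10_000, pos_size=18)
out = enc(torch.tensor([[4, 99, 7]]), torch.tensor([[1, 2, 3]]))
print(out.shape)  # torch.Size([1, 3, 512])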