A Novel Neural Sequence Model with Multiple Attentions for Word Sense Disambiguation

Ahmed, Mahtab; Samee, Muhammad Rifayat; Mercer, Robert E.

doi:10.1109/icmla.2018.00109

Cited by 2 publications

(1 citation statement)

References 22 publications

(33 reference statements)


“…Other existing BLSTM-based WSD algorithms are Seq2Seq-inspired models, which typically underperform conventional supervised WSD models. [44][45][46] Zero-shot learning Zero-shot learning (ZSL) aims at predicting labels for instances that belong to classes that were not directly seen during training. 47,48 The underlying secret ensuring the success of ZSL is to find an intermediate semantic representation to transfer the knowledge learned from seen classes to unseen ones.…”

Section: Bidirectional LSTM

mentioning

confidence: 99%

deepBioWSD: effective deep neural word sense disambiguation of biomedical text data

Pesaranghader, Matwin, Sokolova, et al. 2019

Journal of the American Medical Informatics Association


Objective: In biomedicine, there is a wealth of information hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm prevents downstream difficulties in the natural language processing applications pipeline. Supervised WSD algorithms largely outperform un- or semisupervised and knowledge-based methods; however, they train one separate classifier for each ambiguous term, necessitating a large amount of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable.

Materials and Methods: Built on recent advances in deep learning, our deepBioWSD model leverages a single bidirectional long short-term memory network that makes sense predictions for any ambiguous term. In the model, the Unified Medical Language System sense embeddings are first computed from their text definitions; then, after the network is initialized with these embeddings, it is trained on all available training data collectively. The method also introduces a novel technique for automatically collecting training data from PubMed to (pre)train the network in an unsupervised manner.

Results: We use the MSH WSD dataset to compare WSD algorithms, with macro and micro accuracies employed as evaluation metrics. deepBioWSD outperforms existing models in biomedical text WSD, achieving state-of-the-art performance of 96.82% macro accuracy.

Conclusions: Apart from the disambiguation improvement and unsupervised training, deepBioWSD depends on considerably fewer expert-labeled data, as it learns the target and the context terms jointly. These merits make deepBioWSD conveniently deployable in real-time biomedical applications.
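The abstract names macro and micro accuracy as its evaluation metrics without defining them. The sketch below shows one common reading, assumed here rather than taken from the paper: micro accuracy pools all test instances, while macro accuracy averages the per-term accuracies so that every ambiguous term counts equally. The function name `micro_macro_accuracy`, the record layout, and the toy terms and sense labels are all hypothetical.

```python
from collections import defaultdict


def micro_macro_accuracy(records):
    """Compute (micro, macro) accuracy for WSD predictions.

    records: iterable of (ambiguous_term, gold_sense, predicted_sense).
    Assumed definitions (not confirmed by the abstract):
      micro = correct instances / all instances
      macro = mean of per-term accuracies
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for term, gold, pred in records:
        total[term] += 1
        correct[term] += int(gold == pred)
    if not total:
        raise ValueError("no records to evaluate")
    micro = sum(correct.values()) / sum(total.values())
    macro = sum(correct[t] / total[t] for t in total) / len(total)
    return micro, macro


# Toy data: two ambiguous terms with made-up sense labels.
records = [
    ("cold", "S1", "S1"),
    ("cold", "S2", "S2"),
    ("cold", "S1", "S2"),
    ("cold", "S1", "S1"),
    ("lead", "S1", "S2"),
    ("lead", "S1", "S1"),
]
micro, macro = micro_macro_accuracy(records)
print(f"micro={micro:.4f} macro={macro:.4f}")
```

With unevenly sized terms the two numbers diverge: here the frequent term is disambiguated better than the rare one, so the pooled micro score sits above the macro score.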
