Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1568

Zero-shot Word Sense Disambiguation using Sense Definition Embeddings

Abstract: Word Sense Disambiguation (WSD) is a longstanding but open problem in Natural Language Processing (NLP). WSD corpora are typically small in size, owing to an expensive annotation process. Current supervised WSD methods treat senses as discrete labels and also resort to predicting the Most-Frequent-Sense (MFS) for words unseen during training. This leads to poor performance on rare and unseen senses. To overcome this challenge, we propose Extended WSD Incorporating Sense Embeddings (EWISE), a supervised model t…
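The core idea the abstract describes — scoring senses in a continuous embedding space instead of as discrete labels, so unseen senses can still be ranked — can be illustrated with a minimal sketch. This is not the paper's architecture; the embeddings, sense keys, and similarity function below are hypothetical placeholders for whatever encoder produces context and definition vectors.

```python
import numpy as np

def disambiguate(context_vec, sense_def_embeddings):
    """Pick the sense whose definition embedding is most similar
    (by cosine similarity) to the embedding predicted from the context.

    Illustrative only: in EWISE-style models the context embedding comes
    from a trained encoder and definition embeddings from sense glosses;
    here both are toy vectors.
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(sense_def_embeddings,
               key=lambda sense: cos(context_vec, sense_def_embeddings[sense]))

# Toy 3-d embeddings for two hypothetical senses of "bank"
senses = {
    "bank%financial": np.array([0.9, 0.1, 0.0]),
    "bank%river":     np.array([0.0, 0.2, 0.9]),
}
ctx = np.array([0.8, 0.2, 0.1])   # embedding predicted from the context
print(disambiguate(ctx, senses))  # -> bank%financial
```

Because selection is a nearest-neighbor lookup over definition embeddings, a sense never seen during training can still be chosen — the property that enables zero-shot WSD.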

Cited by 90 publications (105 citation statements)
References 36 publications (54 reference statements)
“…In addition, there is significant research into strategies for learning neural representations of entities in knowledge bases and coding systems. Past work has investigated diverse approaches, such as leveraging rich semantic information from knowledge base structure and web-scale annotated corpora (34,97,98), utilizing definitions of word senses (similar to our use of ICF definitions) (99,100), and combining terminologies with targeted selection of training corpora to learn application-tailored concept representations (101,102). While most of the research on entity representations requires resources not yet available for FSI (e.g., large, annotated corpora; well-developed terminologies; robust and interconnected knowledge graph structure), all present significant opportunities to advance FSI coding technologies as more resources are developed.…”
Section: Alternative Coding Approaches
confidence: 99%
“…To overcome the aforementioned shortcomings, coarser sense inventories (Lacerra et al. 2020) and automatic data augmentation approaches (Pasini and Navigli 2017; Pasini, Elia, and Navigli 2018; Scarlini, Pasini, and Navigli 2019) have been developed to cover more words, senses and languages. At the same time, dedicated architectures have been built to exploit the definitional information of a knowledge base (Luo et al. 2018; Kumar et al. 2019).…”
Section: Related Work
confidence: 99%
“…Among knowledge-based approaches, we took into account the extension of Lesk comprising word embeddings (Basile, Caputo, and Semeraro 2014, Lesk_ext+emb), the extended version of UKB with gloss relations (Agirre, de Lacalle, and Soroa 2014, UKB_gloss) and Babelfy (Moro, Raganato, and Navigli 2014). As for supervised systems, we considered an SVM-based classifier integrated with word embeddings (Iacobacci, Pilehvar, and Navigli 2016, IMS+emb), the Bi-LSTM with attention and a multi-task objective presented in Raganato, Delli Bovi, and Navigli (2017, Bi-LSTM), and the more recent supervised systems leveraging sense definitions, i.e., HCAN (Luo et al. 2018) and EWISE (Kumar et al. 2019). We also performed a comparison with the two LSTM-based architectures of Yuan et al. (2016, LSTM-LP) and context2vec (Melamud, Goldberger, and Dagan 2016) for learning representations of the annotated sentences in the training corpus.…”
Section: WSD Model
confidence: 99%
“…Wang [18] extended the word features into sentence-level features and combined them in a further study. Liu [19] and Kumar [20] focused on the sense of words and studied word embedding. Such reverse thinking provided new insights into disambiguation.…”
Section: State of the Art
confidence: 99%