Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full coverage of WordNet, and without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems using powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features, which requires disambiguation against the full sense inventory, and reveal shortcomings to be improved. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.
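The nearest-neighbour matching described in this abstract can be illustrated with a minimal sketch. It assumes sense embeddings and the contextual embedding of the target word are already available as plain vectors; the sense keys, the toy three-dimensional vectors, and the function names `cosine` and `nearest_sense` are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_sense(context_vec, sense_vecs, candidates=None):
    """Return the sense key whose embedding is closest to context_vec.

    If `candidates` is given (e.g. the senses listed for a known lemma and
    part of speech), only those senses are compared; otherwise the whole
    inventory is searched.
    """
    keys = candidates if candidates is not None else list(sense_vecs)
    return max(keys, key=lambda k: cosine(context_vec, sense_vecs[k]))


# Toy inventory with made-up sense keys and vectors (assumption).
SENSE_VECS = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.1, 0.9, 0.0],
    "shore%land": [0.0, 0.8, 0.6],
}

if __name__ == "__main__":
    ctx = [0.0, 0.7, 0.7]
    # Search over the full inventory.
    print(nearest_sense(ctx, SENSE_VECS))
    # Search restricted to the senses of the lemma "bank".
    print(nearest_sense(ctx, SENSE_VECS, ["bank%finance", "bank%river"]))
```

Restricting `candidates` corresponds to using lemma and part-of-speech features; passing `None` corresponds to the harder setting of disambiguating against the full inventory.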
Transformer-based language models have taken many fields in NLP by storm. BERT and its derivatives dominate most of the existing evaluation benchmarks, including those for Word Sense Disambiguation (WSD), thanks to their ability to capture context-sensitive semantic nuances. However, there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses. In this article, we provide an in-depth quantitative and qualitative analysis of the celebrated BERT model with respect to lexical ambiguity. One of the main conclusions of our analysis is that BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense. Our analysis also reveals that in some cases language models come close to solving coarse-grained noun disambiguation under ideal conditions in terms of availability of training data and computing resources. However, this scenario rarely occurs in real-world settings and, hence, many practical challenges remain even in the coarse-grained setting. We also perform an in-depth comparison of the two main language-model-based WSD strategies, i.e., fine-tuning and feature extraction, finding that the latter approach is more robust with respect to sense bias and can better exploit limited available training data. In fact, the simple feature extraction strategy of averaging contextualized embeddings proves robust even when using only three training sentences per word sense, with minimal improvements obtained by increasing the size of this training data.
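The feature extraction strategy mentioned here, averaging contextualized embeddings per sense, can be sketched as follows. This is only an illustration under stated assumptions: the contextual vectors are taken as given (no language model is called), and the names `mean_vector` and `build_sense_vectors` and the toy two-dimensional data are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector to average")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def build_sense_vectors(examples):
    """Map each sense to the average of its example embeddings.

    `examples` maps a sense label to the contextual embeddings of the target
    word in that sense's training sentences (here, three per sense).
    """
    return {sense: mean_vector(vecs) for sense, vecs in examples.items()}


# Toy data: three made-up embeddings per sense (assumption).
EXAMPLES = {
    "spring%season": [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]],
    "spring%coil": [[4.0, 0.0], [4.0, 2.0], [4.0, 4.0]],
}

if __name__ == "__main__":
    for sense, vec in build_sense_vectors(EXAMPLES).items():
        print(sense, vec)
```

The resulting per-sense averages can then be compared with a new contextual embedding using any similarity measure, for example cosine similarity.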
Despite its importance, the time variable has been largely neglected in the NLP and language model literature. In this paper, we present TimeLMs, a set of language models specialized on diachronic Twitter data. We show that a continual learning strategy contributes to enhancing Twitter-based language models' capacity to deal with future and out-of-distribution tweets, while making them competitive with standardized and more monolithic benchmarks. We also perform a number of qualitative analyses showing how they cope with trends and peaks in activity involving specific named entities or concept drift. TimeLMs is available at https://github.com/cardiffnlp/timelms.
Progress in the field of Natural Language Processing (NLP) has been closely followed by applications in the medical domain. Recent advancements in Neural Language Models (NLMs) have transformed the field and are currently motivating numerous works exploring their application in different domains. In this paper, we explore how NLMs can be used for Medical Entity Linking with the recently introduced MedMentions dataset, which presents two major challenges: (1) a large target ontology of over 2M concepts, and (2) low overlap between concepts in train, validation and test sets. We introduce a solution, MedLinker, that addresses these issues by leveraging specialized NLMs with Approximate Dictionary Matching, and show that it performs competitively on semantic type linking, while improving the state-of-the-art on the more fine-grained task of concept linking (+4 F1 on MedMentions main task).
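Approximate dictionary matching can be illustrated with a small sketch. The abstract does not say which string similarity MedLinker uses, so the character-trigram Jaccard similarity below is an assumption chosen for simplicity; the concept identifiers, the threshold value, and the function names are likewise hypothetical.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams of the lowercased text, padded with '#'."""
    pad = "#" * (n - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def approximate_match(mention, dictionary, threshold=0.5):
    """Return (concept_id, score) for the most similar dictionary name.

    concept_id is None when no name reaches the threshold.
    """
    mention_grams = char_ngrams(mention)
    best_id, best_score = None, 0.0
    for name, concept_id in dictionary.items():
        score = jaccard(mention_grams, char_ngrams(name))
        if score > best_score:
            best_id, best_score = concept_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Toy dictionary with made-up concept identifiers (assumption).
DICTIONARY = {
    "myocardial infarction": "C-0001",
    "diabetes mellitus": "C-0002",
}

if __name__ == "__main__":
    print(approximate_match("myocardial infarctions", DICTIONARY))
    print(approximate_match("xyz", DICTIONARY))
```

A linear scan like this would not scale to an ontology of over 2M concepts; a practical system would need an index over n-grams, which is outside the scope of this sketch.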
State-of-the-art methods for Word Sense Disambiguation (WSD) combine two different features: the power of pre-trained language models and a propagation method to extend the coverage of such models. This propagation is needed as current sense-annotated corpora lack coverage of many instances in the underlying sense inventory (usually WordNet). At the same time, unambiguous words make up a large portion of all words in WordNet, while being poorly covered in existing sense-annotated corpora. In this paper, we propose a simple method to provide annotations for most unambiguous words in a large corpus. We introduce the UWA (Unambiguous Word Annotations) dataset and show how a state-of-the-art propagation-based model can use it to extend the coverage and quality of its word sense embeddings by a significant margin, improving on its original results on WSD.
The oxygen fugacities (fO2's) of magnetically-concentrated fractions (MCF) of three rock samples from the Skaergaard Layered Intrusion were measured between 800-1150 °C using oxygen-specific, solid zirconia electrolytes at atmospheric pressure. Two of the bulk rock samples (an oxide cumulate and an oxide-bearing gabbro) are from the Middle Zone (MZ) and the other (an olivine plagioclase orthocumulate) is from the Lower Zone (LZ). All MCF define fO2 versus T arrays that lie 1.5-0.5 log units above the fayalite-magnetite-quartz (FMQ) buffer. Experiments with different cell-imposed initial redox states (one from a reduced direction and one from an oxidized direction) were run on each sample in an attempt to achieve experimental reversibility. This was accomplished by imposing a known redox memory on the galvanic cell prior to loading each sample. Reversibility for each sample agreed to better than 0.2 of a log unit. Irreversible autoreduction of 0.2 of a log unit was observed on the two MZ samples at temperatures exceeding 1065 °C. Scanning electron microscope and electron microprobe study of pre- and post-run products shows that reaction and textural re-equilibration occurred among the oxide phase assemblages under the experimental conditions employed. Careful characterization of pre- and post-run assemblages is clearly necessary before adequate interpretation of the experimental results can be made in these types of electrochemical studies. Different approaches to investigations of the fO2 of the Skaergaard Intrusion, be it thermodynamic calculations or experimental methods, should yield concordant results or at least understandable discrepancies. Calculated fO2's using thermobarometry applied to the ilmenite-magnetite pairs in the post-experimental assemblages agree with the experimentally determined fO2's to within one log unit at a given temperature. These results are also consistent with previously calculated fO2 values (Buddington and Lindsley 1964; Morse et al. 1980), but are considerably more oxidized than a previous electrolyte-based fO2 study of a different sample suite from the Skaergaard (Sato and Valenza 1980) that includes values close to the iron-wustite (IW) buffer from both MZ and LZ oxide separates. Differences between this electrochemical study and that of Sato and Valenza (1980) may be due to variations in the level of indigenous (or curatorially-introduced) carbon in the samples studied. Despite a number of experimental difficulties, electrochemical cells can provide an accurate and precise method of determining the oxygen fugacity of naturally occurring, complex oxide assemblages. Tight experimental reversals and reproducible values obtained in heating and cooling cycles are an indication of the precision and accuracy of the data recoverable with electrochemical cells.
Offprint requests to: A.B. Kersting