Contextualised word embeddings are a powerful tool for detecting contextual synonyms. However, most current state-of-the-art (SOTA) deep learning concept extraction methods remain supervised and underexploit the potential of context. In this paper, we propose a self-supervised pre-training approach that can detect contextual synonyms of concepts after being trained on data created by shallow matching. We apply our methodology in a sparse multi-class setting (over 15,000 concepts) to extract phenotype information from electronic health records. We further investigate data augmentation techniques to address the problem of class sparsity. Our approach achieves a new SOTA for unsupervised phenotype concept annotation on clinical text in F1 and recall, outperforming the previous SOTA by up to 4.5 and 4.0 absolute points, respectively. After fine-tuning with as little as 20% of the labelled data, we also outperform BioBERT and ClinicalBERT. An extrinsic evaluation on three ICU benchmarks also shows the benefit of using the phenotypes annotated by our model as features.
Reduced electrodermal activity (EDA) may be related to autonomic neuropathy (AN). The aims of this study were to independently study the characteristics of EDA and its correlation with diabetes and AN. During a self-designed test, the mean skin conductance level (M-SCL), the mean skin conductance response (M-SCR) to stimuli, and the difference in M-SCL between feet (DBF) were obtained through a model-based decomposition of the EDA based on Bayesian statistics and convex optimization. Twenty-two subjects were included in the final test. Diabetic patients were stratified based on their clinical history and care habits into those out of risk and those at risk of developing AN. A statistically significant difference with respect to the control group was found for the latter in M-SCR (p < 0.01) and DBF (p < 0.05). While past research failed to address potential sources of interference with the EDA measurement, namely emotional state, degree of concentration on the task, and body posture, this study proposes a well-defined protocol to stimulate subjects and acquire reliable EDA data.
Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it). This volume includes the reports of both task organisers and participants in all of the EVALITA 2020 challenges. In the 2020 edition, we coordinated the organization of 14 different tasks belonging to five research areas: (i) Affect, Hate, and Stance, (ii) Creativity and Style, (iii) New Challenges in Long-standing Tasks, (iv) Semantics and Multimodality, (v) Time and Diachrony. The volume opens with an overview of the EVALITA 2020 campaign, in which we describe the tasks and provide statistics on the participants and task organizers, as well as on our supporting sponsors. The abstract of the keynote speech by Preslav Nakov, titled "Flattening the Curve of the COVID-19 Infodemic: These Evaluation Campaigns Can Help!", is also included in this collection. Due to the 2020 COVID-19 pandemic, the traditional workshop was held online, where several members of the Italian NLP community presented the results of their research. Despite the circumstances, the workshop was an occasion for all participants, from both academic institutions and private companies, to disseminate their work and results and to share ideas through online sessions dedicated to each task and a general discussion during the plenary event. We carried on with the tradition of the "Best system across tasks" award. As in 2018, it was an incentive for students, IT developers and researchers to push the boundaries of the state of the art by facing tasks in new ways, even if not winning.