Proceedings of the 4th Clinical Natural Language Processing Workshop 2022
DOI: 10.18653/v1/2022.clinicalnlp-1.9
Clinical Flair: A Pre-Trained Language Model for Spanish Clinical Natural Language Processing

Abstract: Word embeddings have been widely used in Natural Language Processing (NLP) tasks. Although these representations can capture the semantic information of words, they cannot learn sequence-level semantics. This problem can be handled using contextual word embeddings derived from pre-trained language models, which have contributed to significant improvements in several NLP tasks. Further improvements are achieved when pre-training these models on domain-specific corpora. In this paper, we introduce Clinical Flair…

Cited by 8 publications (8 citation statements). References 11 publications.
“…Nowadays, there is strong development of contextualized word embeddings, which assign dynamic representations to words based on their contexts and achieve state-of-the-art performance in multiple tasks. For the clinical domain in Spanish, relevant works include (Akhtyamova et al., 2020; Carrino et al., 2022; Rojas et al., 2022). These contextualized word embeddings are challenging to compute and deploy in production environments due to their demanding infrastructure needs…”
Section: Discussion (mentioning, confidence: 99%)
“…These embeddings, however, were neither intrinsically evaluated nor compared performance-wise with other embeddings, and they were not made available for use. In another work, Akhtyamova et al. (2020) used the Flair (Akbik et al., 2019) and BERT (Devlin et al., 2018) models to compute word embeddings for the Spanish clinical domain as part of a named entity recognition (NER) task, and Rojas et al. (2022) computed another Flair language model from clinical narratives in Spanish. These models use contextualized word embeddings that take the word's context into account when the embedding is calculated…”
Section: Related Work (mentioning, confidence: 99%)
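The statement above describes computing a Flair character-level language model from Spanish clinical narratives. Below is a minimal sketch, using the Flair framework's standard language-model training API, of how such a model can be pre-trained; the corpus path, output folder, and all hyperparameters are illustrative assumptions, not the settings reported by Rojas et al. (2022).

```python
# Minimal sketch: pre-training a character-level Flair language model on a
# plain-text clinical corpus. Paths and hyperparameters are assumptions.
from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# Character dictionary bundled with Flair.
dictionary = Dictionary.load("chars")

# The folder is assumed to hold raw clinical text split into train/ parts,
# plus valid.txt and test.txt, as Flair's TextCorpus expects.
is_forward_lm = True
corpus = TextCorpus("resources/clinical_corpus_es",  # hypothetical path
                    dictionary,
                    is_forward_lm,
                    character_level=True)

# One-layer LSTM character LM; hidden size chosen for illustration only.
language_model = LanguageModel(dictionary, is_forward_lm, hidden_size=1024, nlayers=1)

trainer = LanguageModelTrainer(language_model, corpus)
trainer.train("resources/clinical-flair-forward",  # hypothetical output folder
              sequence_length=250,
              mini_batch_size=100,
              max_epochs=10)
```

A backward model can be trained the same way with `is_forward_lm = False`; forward and backward models are typically used together as contextualized embeddings.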
“…[50] Specifically, they used Bi-LSTM and CRF layers to recognize each entity type and incorporated pre-trained embeddings trained on the Chilean Waiting List corpus [44, 51] and character-level contextualized embeddings [52]. The code is freely available [53]…”
Section: Methods (mentioning, confidence: 99%)
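As a concrete illustration of the setup described in the statement above, the following sketch builds a Flair sequence tagger that stacks classic word embeddings with character-level contextualized (Flair) embeddings and decodes with a CRF layer on top of a Bi-LSTM. The corpus layout, embedding file paths, and hidden size are assumptions for illustration only.

```python
# Sketch of a Bi-LSTM-CRF NER tagger in the Flair framework, stacking classic
# word embeddings with character-level contextualized (Flair) embeddings.
# All file paths are hypothetical placeholders.
from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger

# CoNLL-style column corpus: token in column 0, NER tag in column 1.
corpus = ColumnCorpus("resources/ner_corpus_es", {0: "text", 1: "ner"},
                      train_file="train.txt", dev_file="dev.txt", test_file="test.txt")
tag_dictionary = corpus.make_label_dictionary(label_type="ner")

embeddings = StackedEmbeddings([
    WordEmbeddings("resources/waiting_list_300d.gensim"),    # hypothetical 300-d word vectors
    FlairEmbeddings("resources/clinical-flair-forward.pt"),  # hypothetical forward clinical LM
    FlairEmbeddings("resources/clinical-flair-backward.pt"), # hypothetical backward clinical LM
])

# Bi-LSTM encoder with a CRF decoding layer (use_crf=True).
tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type="ner",
                        use_crf=True)
```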
“…Regarding the experimental setup, the disease model was trained for 150 epochs using an SGD optimizer with mini-batches of size 32 and a learning rate of 0.1. As mentioned, to encode sentences we used two types of representations: a 300-dimensional word embedding model trained on the Chilean Waiting List corpus [4] and character-level contextualized embeddings retrieved from the Clinical Flair model (Rojas et al., 2022b). To implement the model and perform our experiments, we used the Flair framework, which is widely used by the NLP research community (Akbik et al., 2019)…”
Section: NER Model (mentioning, confidence: 99%)
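To make the quoted hyperparameters concrete, here is a hedged sketch of the corresponding Flair training call, reusing the `tagger` and `corpus` objects from the previous sketch; the output folder is a hypothetical placeholder, and Flair's `ModelTrainer` uses SGD as its default optimizer.

```python
# Training loop with the hyperparameters quoted above: 150 epochs, SGD,
# mini-batch size 32, learning rate 0.1. Assumes `tagger` and `corpus`
# from the previous sketch; the output path is hypothetical.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.train("resources/taggers/disease-ner",
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=150)
```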