Proceedings of the 3rd Clinical Natural Language Processing Workshop 2020
DOI: 10.18653/v1/2020.clinicalnlp-1.7

BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition

Abstract: With the growing volume of electronic health record data, clinical NLP tasks have become increasingly relevant to unlock valuable information from unstructured clinical text. Although the performance of downstream NLP tasks, such as named-entity recognition (NER), on English corpora has recently been improved by contextualised language models, less research is available for clinical texts in low-resource languages. Our goal is to assess a deep contextual embedding model for Portuguese, called BioBERTpt, to suppor…
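The released BioBERTpt checkpoints are BERT-style encoders, so they plug into standard token-classification tooling. The sketch below is not the authors' pipeline: it assumes the Hugging Face Transformers library, a Hub identifier of the form "pucpr/biobertpt-all", and a hypothetical tag set; the classification head it adds is randomly initialised and would still need fine-tuning on annotated clinical NER data before producing meaningful labels.

```python
# Minimal sketch: wrap a BioBERTpt encoder with a token-classification head.
# "pucpr/biobertpt-all" and the label set are assumptions for illustration only.
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "pucpr/biobertpt-all"          # assumed Hub identifier
LABELS = ["O", "B-Disorder", "I-Disorder"]  # hypothetical BIO tag set

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
    label2id={label: i for i, label in enumerate(LABELS)},
)

# Tokenise a Portuguese clinical sentence and run a forward pass; after fine-tuning,
# the argmax over the logits gives one BIO tag per word piece.
inputs = tokenizer("Paciente com diabetes mellitus em uso de metformina.",
                   return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (batch, sequence_length, num_labels)
```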

Cited by 46 publications (33 citation statements). References 19 publications.

“…Alsentzer et al. (2019) trained BERT and BioBERT (Lee et al., 2019) on MIMIC notes, and showed that Bio + Clinical BERT performed better than BERT and BioBERT when fine-tuned on the MedNLI and i2b2 2010 datasets. Similarly, Schneider et al. (2020) demonstrated that BERT fine-tuned on Portuguese clinical notes outperformed BERT trained on general corpora.…”
Section: Clinical Named Entity Recognition
confidence: 90%
“…Various NER challenges and shared tasks, such as the i2b2 and n2c2 NLP challenges (Uzuner et al., 2010; Suominen et al., 2013; Kelly et al., 2014; Bethard et al., 2015; Névéol et al., 2015; Henry et al., 2020), fostered the development of NER methods (De Bruijn et al., 2011; Jiang et al., 2011; Kim et al., 2015; Van Mulligen et al., 2016; El Boukkouri et al., 2019) for the clinical domain in different languages (Lopes et al., 2019; Sun and Yang, 2019; Andrioli de Souza et al., 2020; Schneider et al., 2020). The DEFT challenge proposed an information extraction task for the French clinical corpus, with entities distributed across four categories: anatomy, clinical practices, treatments, and time (Cardon et al., 2020).…”
Section: Clinical Named Entity Recognition
confidence: 99%
“…Various NER challenges and shared tasks [Uzuner et al., 2010, Kelly et al., 2014, Névéol et al., 2015, Suominen et al., 2013, Bethard et al., 2015] fostered the development of NER methods [Van Mulligen et al., 2016, Kim et al., 2015, Jiang et al., 2011, De Bruijn et al., 2011, El Boukkouri et al., 2019] for the clinical domain in different languages [Lopes et al., 2019, Schneider et al., 2020, Sun and Yang, 2019]. Performance varies greatly across the different methods and corpora, with more modern methods achieving F1-scores as high as 95%.…”
Section: Related Work
confidence: 99%
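The F1 figures quoted above are conventionally entity-level scores computed over BIO-tagged spans rather than individual tokens. A minimal illustration of that convention using the seqeval library, with placeholder tags rather than data from any of the cited studies:

```python
# Entity-level F1 over BIO tag sequences with seqeval: a predicted entity counts as
# correct only if both its span boundaries and its type match the gold annotation.
from seqeval.metrics import classification_report, f1_score

y_true = [["O", "B-Disorder", "I-Disorder", "O", "B-Treatment"]]
y_pred = [["O", "B-Disorder", "I-Disorder", "O", "O"]]  # misses the Treatment span

print(f1_score(y_true, y_pred))           # ~0.67: one of two gold entities recovered
print(classification_report(y_true, y_pred))
```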
“…As input features to the classic classifiers, we used the TF-IDF vector representations of the texts to be classified. As state-of-the-art approaches, we fine-tuned BERT-based models such as its multilingual version [Devlin et al., 2019]; BERTimbau [Souza et al., 2020], its Brazilian Portuguese version; and BioBERTpt [Schneider et al., 2020], a Brazilian Portuguese version of BERT focused on the clinical domain.…”
Section: Models
confidence: 99%
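A rough sketch of the kind of TF-IDF baseline described in that excerpt, assuming scikit-learn; the texts and labels are placeholders, and logistic regression stands in for whichever classic classifier the cited work actually used:

```python
# TF-IDF features feeding a classic classifier, as a baseline against fine-tuned
# BERT-style models. Corpus and labels below are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["paciente com dor torácica", "exame sem alterações"]  # placeholder corpus
labels = [1, 0]                                                 # placeholder classes

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # TF-IDF vectors as input features
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["paciente refere dor"]))
```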