Proceedings of the 5th Workshop on BioNLP Open Shared Tasks 2019
DOI: 10.18653/v1/d19-5709

Biomedical Named Entity Recognition with Multilingual BERT

Abstract: We present the approach of the Turku NLP group to the PharmaCoNER task on Spanish biomedical named entity recognition. We apply a CRF-based baseline approach and multilingual BERT to the task, achieving an F-score of 88% on the development data and 87% on the test set with BERT. Our approach reflects a straightforward application of a state-of-the-art multilingual model that is not specifically tailored to either the language or the application domain. The source code …
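As a rough, hedged sketch of the BERT side of this approach (not the authors' released code), the snippet below fine-tunes bert-base-multilingual-cased for BIO-tagged token classification with the Hugging Face transformers library; the label subset, example sentence, and tag-alignment details are illustrative assumptions.

```python
# Minimal sketch: fine-tuning multilingual BERT for Spanish biomedical NER.
# Not the authors' code; labels below are an assumed subset of PharmaCoNER-style tags.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PROTEINAS", "I-PROTEINAS", "B-NORMALIZABLES", "I-NORMALIZABLES"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels)
)

# One toy Spanish sentence with word-level BIO tags.
words = ["Se", "detectó", "una", "mutación", "en", "BRCA1", "."]
word_tags = ["O", "O", "O", "O", "O", "B-PROTEINAS", "O"]

# Tokenize pre-split words; each subword inherits its word's tag,
# and special tokens ([CLS], [SEP]) get -100 so the loss ignores them.
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
aligned = [
    -100 if wid is None else labels.index(word_tags[wid])
    for wid in enc.word_ids(batch_index=0)
]

outputs = model(**enc, labels=torch.tensor([aligned]))
outputs.loss.backward()  # a real run loops over batches with an optimizer and scheduler
```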

Cited by 63 publications (31 citation statements). References 16 publications.

“…For example, multilingual BERT (Devlin et al, 2019) was trained on Wikipedia articles from more than 100 languages. Although performance improvements show the possibility to use multilingual BERT in monolingual (Hakala and Pyysalo, 2019), multilingual (Tsai et al, 2019) and cross-lingual settings (Wu and Dredze, 2019), it has been questioned whether multilingual BERT is truly multilingual (Pires et al, 2019; Singh et al, 2019; Libovický et al, 2019). Therefore, we will investigate the benefits of aligning its embeddings in our experiments.…”
Section: Related Work
confidence: 99%
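As a small illustration of the cross-lingual claim in the excerpt above (not part of the cited experiments), the sketch below mean-pools mBERT's last hidden layer for an English sentence and its Spanish translation and compares the two vectors; the pooling choice and example sentences are assumptions.

```python
# Sketch: compare mBERT representations of parallel English/Spanish sentences.
# Mean pooling over the last hidden layer is an assumed, simplistic sentence embedding.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
mbert = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentence: str) -> torch.Tensor:
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        return mbert(**enc).last_hidden_state.mean(dim=1).squeeze(0)

en = embed("The protein binds to the receptor.")
es = embed("La proteína se une al receptor.")
print(torch.cosine_similarity(en, es, dim=0).item())
```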
“…BERT is a multi-layer transformer trained on the English Wikipedia and BookCorpus (Devlin et al, 2018). While it is trained to predict whether a sentence follows another and randomly blacked out words, the resulting language model can be finetuned for different tasks, such as NER (Hakala and Pyysalo, 2019) and NEN, or adapted for different domains through further training. BioBERT is the result of training BERT on PubMed articles, making it useful for biomedical applications (Lee et al, 2020; Sun and Yang, 2019).…”
Section: BioBERT
confidence: 99%
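The masked-word objective described in the excerpt can be illustrated with the fill-mask pipeline below; bert-base-uncased is used because it ships a pretrained masked-LM head, and whether a particular BioBERT checkpoint includes one is an assumption to check before substituting it.

```python
# Sketch of BERT's masked-token prediction; the example sentence is made up.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The patient was treated with [MASK] for the infection."):
    print(f"{pred['token_str']:>12}  score={pred['score']:.3f}")
```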
“…Current state-of-the-art NER systems are mainly based on annotated data and machine learning approaches. The lexicons introduced in some of these systems are mainly for extracting some external features (Liu et al, 2015; Agerri and Rigau, 2016; Chiu and Nichols, 2016). […] (Huang et al, 2015), the convolutional neural network (CNN) plus a CRF layer, the combination of LSTM and CNN (Chiu and Nichols, 2016), and the BERT-based LSTM+CRF model (Jiang et al, 2019; Hakala and Pyysalo, 2019).…”
Section: Related Work
confidence: 99%
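For readers unfamiliar with the architectures named in that excerpt, the minimal PyTorch sketch below shows a BiLSTM encoder that produces per-token emission scores; the CRF decoding layer mentioned in the excerpt is omitted for brevity (a library such as pytorch-crf could supply it), and all dimensions and data are arbitrary.

```python
# Minimal BiLSTM tagger: embeddings -> bidirectional LSTM -> per-token tag scores.
# No CRF layer here; this is only the encoder half of a BiLSTM+CRF model.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size: int, num_tags: int, emb_dim: int = 100, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        hidden_states, _ = self.lstm(self.embed(token_ids))
        return self.proj(hidden_states)  # (batch, seq_len, num_tags)

# Toy usage: score a batch of two 6-token sentences over 5 BIO tags.
model = BiLSTMTagger(vocab_size=1000, num_tags=5)
scores = model(torch.randint(0, 1000, (2, 6)))
print(scores.shape)  # torch.Size([2, 6, 5])
```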