Proceedings of the Student Research Workshop Associated With RANLP 2019
DOI: 10.26615/issn.2603-2821.2019_004

Multilingual Language Models for Named Entity Recognition in German and English

Abstract: We assess the language specificity of recent language models by exploring the potential of a multilingual language model. In particular, we evaluate Google's multilingual BERT (mBERT) model on Named Entity Recognition (NER) in German and English. We expand the work on language model fine-tuning by Howard and Ruder (2018), applying it to the BERT architecture. We successfully reproduce the NER results published by Devlin et al. (2019). Our results show that the multilingual language model generalises well for N…
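The abstract describes fine-tuning mBERT for NER, extending the language-model fine-tuning approach of Howard and Ruder (2018) to the BERT architecture. As a rough illustration of what such a setup looks like in practice, the following is a minimal sketch assuming the Hugging Face transformers library and the public bert-base-multilingual-cased checkpoint; the label set, example sentence, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: fine-tuning multilingual BERT (mBERT) for NER as token
# classification. Checkpoint name, label set, and hyperparameters are
# assumptions for illustration, not the paper's actual setup.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels)
)
model.train()

# One toy German sentence with word-level BIO tags.
words = ["Angela", "Merkel", "besucht", "Berlin", "."]
word_tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]

# Tokenize into subwords; label only the first subword of each word and
# mask the rest with -100 so they are excluded from the loss.
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt",
                truncation=True, max_length=128)
label_ids, previous = [], None
for word_id in enc.word_ids(batch_index=0):
    if word_id is None:
        label_ids.append(-100)          # special tokens ([CLS], [SEP])
    elif word_id != previous:
        label_ids.append(labels.index(word_tags[word_id]))
    else:
        label_ids.append(-100)          # continuation subword
    previous = word_id

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(**enc, labels=torch.tensor([label_ids]))
outputs.loss.backward()                  # one illustrative training step
optimizer.step()
```

A full experiment would repeat this step over batches of annotated training data and report span-level F1 on held-out data.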

Cited by 11 publications (9 citation statements).
References 16 publications (26 reference statements).

“…We conduct experiments with four different multilingual BERT models: multilingual cased BERT-base (mBERT), multilingual cased DistilBERT (DistilmBERT), cased XLM-100 and cross-lingual RoBERTa (XLM-RoBERTa). All these models are available via the Hugging Face transformers library. Each model is available with sequence lengths of 128 and 512, and we experiment with both.…”
Section: Methods (mentioning)
Confidence: 99%
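The four models named in this citation statement are all distributed through the Hugging Face transformers library. The sketch below shows, assuming the usual public checkpoint names (bert-base-multilingual-cased, distilbert-base-multilingual-cased, xlm-mlm-100-1280, xlm-roberta-base) and a placeholder label-set size, how such models could be loaded and how a maximum sequence length of 128 or 512 tokens could be imposed at tokenization time; the exact checkpoints and task heads used by the citing authors are not stated here.

```python
# Hypothetical loading of the four multilingual models mentioned above via the
# Hugging Face transformers library; checkpoint IDs and num_labels are assumed.
from transformers import AutoModelForTokenClassification, AutoTokenizer

checkpoints = {
    "mBERT": "bert-base-multilingual-cased",
    "DistilmBERT": "distilbert-base-multilingual-cased",
    "XLM-100": "xlm-mlm-100-1280",
    "XLM-RoBERTa": "xlm-roberta-base",
}

for name, ckpt in checkpoints.items():
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    # num_labels=9 is a placeholder for a BIO tag set; adjust to the task.
    model = AutoModelForTokenClassification.from_pretrained(ckpt, num_labels=9)
    # The citing work experiments with maximum sequence lengths of 128 and 512;
    # the limit is enforced here by truncating/padding at tokenization time.
    for max_len in (128, 512):
        batch = tokenizer(["Berlin ist die Hauptstadt von Deutschland ."],
                          truncation=True, padding="max_length",
                          max_length=max_len, return_tensors="pt")
        outputs = model(**batch)
        print(name, max_len, outputs.logits.shape)
```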
“…In their work, multilingual BERT models outperformed monolingual baselines for text classification and NER tasks, while for POS-tagging and dependency parsing the multilingual BERT models fell behind the previously proposed methods, most of which were utilizing monolingual contextual ELMo embeddings [13]. Baumann [2] evaluated multilingual BERT models on the German NER task and found that while the multilingual BERT models outperformed two non-contextual LSTM-CRF-based baselines, they performed worse than a model utilizing monolingual contextual character-based string embeddings [1]. Kuratov et al. [9] applied multilingual BERT models to several tasks in Russian.…”
Section: Related Work (mentioning)
Confidence: 99%
“…In their work, multilingual BERT models outperformed monolingual baselines for text classification and NER tasks, while for POS-tagging and dependency parsing, the multilingual BERT models fell behind the previously proposed methods, most of which were utilizing monolingual contextual ELMo embeddings [1]. Baumann [10] evaluated multilingual BERT models on the German NER task and found that while the multilingual BERT models outperformed two non-contextual LSTM-CRF-based baselines, they performed worse than a model utilizing monolingual contextual character-based string embeddings [11]. Kuratov et al. [12] applied multilingual BERT models to several tasks in Russian.…”
Section: Related Work (mentioning)
Confidence: 99%
“…Existing approaches to cross-lingual NER can be roughly grouped into two main categories: instance-based transfer via machine translation (MT) and label projection (Mayhew et al., 2017; Jain et al., 2019), and model-based transfer with aligned cross-lingual word representations or pretrained multilingual language models (Joty et al., 2017; Baumann, 2019; Conneau et al., 2020). Recently, Wu et al. (2020) unify instance-based and model-based transfer via knowledge distillation.…”
Section: Introduction (mentioning)
Confidence: 99%