Proceedings of the Student Research Workshop Associated With RANLP 2019
DOI: 10.26615/issn.2603-2821.2019_004

Multilingual Language Models for Named Entity Recognition in German and English

Abstract: We assess the language specificity of recent language models by exploring the potential of a multilingual language model. In particular, we evaluate Google's multilingual BERT (mBERT) model on Named Entity Recognition (NER) in German and English. We expand the work on language model fine-tuning by Howard and Ruder (2018), applying it to the BERT architecture. We successfully reproduce the NER results published by Devlin et al. (2019). Our results show that the multilingual language model generalises well for N…
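The abstract describes fine-tuning mBERT for NER, extending the language-model fine-tuning approach of Howard and Ruder (2018) to the BERT architecture. As a rough illustration of what such a setup looks like in practice, the following is a minimal sketch assuming the Hugging Face transformers library and the public bert-base-multilingual-cased checkpoint; the label set, example sentence, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: fine-tuning multilingual BERT (mBERT) for NER as token
# classification. Checkpoint name, label set, and hyperparameters are
# assumptions for illustration, not the paper's actual setup.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels)
)
model.train()

# One toy German sentence with word-level BIO tags.
words = ["Angela", "Merkel", "besucht", "Berlin", "."]
word_tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]

# Tokenize into subwords; label only the first subword of each word and
# mask the rest with -100 so they are excluded from the loss.
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt",
                truncation=True, max_length=128)
label_ids, previous = [], None
for word_id in enc.word_ids(batch_index=0):
    if word_id is None:
        label_ids.append(-100)          # special tokens ([CLS], [SEP])
    elif word_id != previous:
        label_ids.append(labels.index(word_tags[word_id]))
    else:
        label_ids.append(-100)          # continuation subword
    previous = word_id

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(**enc, labels=torch.tensor([label_ids]))
outputs.loss.backward()                  # one illustrative training step
optimizer.step()
```

A full experiment would repeat this step over batches of annotated training data and report span-level F1 on held-out data.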

Cited by 11 publications (9 citation statements).
References 16 publications (26 reference statements).

“…We conduct experiments with four different multilingual BERT models: multilingual cased BERT-base (mBERT), multilingual cased DistilBERT (DistilmBERT), cased XLM-100 and cross-lingual RoBERTa (XLM-RoBERTa). All these models are available via the Hugging Face transformers library. Each model is available with sequence lengths of 128 and 512, and we experiment with both.…”
Section: Methods (mentioning)
Confidence: 99%
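The four models named in this citation statement are all distributed through the Hugging Face transformers library. The sketch below shows, assuming the usual public checkpoint names (bert-base-multilingual-cased, distilbert-base-multilingual-cased, xlm-mlm-100-1280, xlm-roberta-base) and a placeholder label-set size, how such models could be loaded and how a maximum sequence length of 128 or 512 tokens could be imposed at tokenization time; the exact checkpoints and task heads used by the citing authors are not stated here.

```python
# Hypothetical loading of the four multilingual models mentioned above via the
# Hugging Face transformers library; checkpoint IDs and num_labels are assumed.
from transformers import AutoModelForTokenClassification, AutoTokenizer

checkpoints = {
    "mBERT": "bert-base-multilingual-cased",
    "DistilmBERT": "distilbert-base-multilingual-cased",
    "XLM-100": "xlm-mlm-100-1280",
    "XLM-RoBERTa": "xlm-roberta-base",
}

for name, ckpt in checkpoints.items():
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    # num_labels=9 is a placeholder for a BIO tag set; adjust to the task.
    model = AutoModelForTokenClassification.from_pretrained(ckpt, num_labels=9)
    # The citing work experiments with maximum sequence lengths of 128 and 512;
    # the limit is enforced here by truncating/padding at tokenization time.
    for max_len in (128, 512):
        batch = tokenizer(["Berlin ist die Hauptstadt von Deutschland ."],
                          truncation=True, padding="max_length",
                          max_length=max_len, return_tensors="pt")
        outputs = model(**batch)
        print(name, max_len, outputs.logits.shape)
```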
“…In their work, multilingual BERT models outperformed monolingual baselines for text classification and NER tasks, while for POS-tagging and dependency parsing the multilingual BERT models fell behind the previously proposed methods, most of which were utilizing monolingual contextual ELMo embeddings [13]. Baumann [2] evaluated multilingual BERT models on the German NER task and found that while the multilingual BERT models outperformed two non-contextual LSTM-CRF-based baselines, they performed worse than a model utilizing monolingual contextual character-based string embeddings [1]. Kuratov et al. [9] applied multilingual BERT models to several tasks in Russian.…”
Section: Related Work (mentioning)
Confidence: 99%
“…In their work, multilingual BERT models outperformed monolingual baselines for text classification and NER tasks, while for POS-tagging and dependency parsing, the multilingual BERT models fell behind the previously proposed methods, most of which were utilizing monolingual contextual ELMo embeddings [1]. Baumann [10] evaluated multilingual BERT models on the German NER task and found that while the multilingual BERT models outperformed two non-contextual LSTM-CRF-based baselines, they performed worse than a model utilizing monolingual contextual character-based string embeddings [11]. Kuratov et al. [12] applied multilingual BERT models to several tasks in Russian.…”
Section: Related Work (mentioning)
Confidence: 99%
“…Existing approaches to cross-lingual NER can be roughly grouped into two main categories: instance-based transfer via machine translation (MT) and label projection (Mayhew et al., 2017; Jain et al., 2019), and model-based transfer with aligned cross-lingual word representations or pretrained multilingual language models (Joty et al., 2017; Baumann, 2019; Conneau et al., 2020). Recently, Wu et al. (2020) unify instance-based and model-based transfer via knowledge distillation.…”
Section: Introduction (mentioning)
Confidence: 99%