Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020
DOI: 10.24963/ijcai.2020/543

UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

Abstract: Prior work in cross-lingual named entity recognition (NER) with no/little labeled data falls into two primary categories: model transfer- and data transfer-based methods. In this paper, we find that both method types can complement each other, in the sense that the former can exploit context information via language-independent features but sees no task-specific information in the target language; while the latter generally generates pseudo target-language training data via translation but its exploit…
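
To make the two method types concrete: model transfer in its simplest form fine-tunes a multilingual encoder on source-language NER data and applies it unchanged to target-language text, so no target-specific supervision is used. Below is a minimal sketch of that zero-shot setup; the encoder name and label set are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of model-transfer-based cross-lingual NER:
# fine-tune a multilingual encoder on source-language data, then apply
# it unchanged to the target language. The model name and label set
# below are illustrative assumptions, not the paper's configuration.
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          pipeline)

MODEL = "bert-base-multilingual-cased"  # assumed multilingual encoder
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
    label2id={label: i for i, label in enumerate(LABELS)},
)

# ... fine-tune `model` on labeled source-language (e.g. English) NER
# data here; the language-independent features are what transfer ...

# Zero-shot application to a target-language (here: Spanish) sentence.
ner = pipeline("token-classification", model=model, tokenizer=tokenizer)
print(ner("Lionel Messi nació en Rosario, Argentina."))
```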

Cited by 24 publications (30 citation statements) · References 0 publications

Citation statements (ordered by relevance):
“…All datasets are labeled with 4 entity types: PER, ORG, LOC, MISC. Each of them is split into training, validation and test sets following Wu et al. (2020b). We use three MRC datasets in target languages: MLQA (es) (Lewis et al., 2019), XQuAD (de) (Artetxe et al., 2019), and SQuAD (en) (Rajpurkar et al., 2016).…”
Section: Data Preparation (mentioning)
confidence: 99%
“…UniTrans. Wu et al. (2020b) unify data transfer and model transfer for cross-lingual NER. mCell LSTM.…”
Section: Systems (mentioning)
confidence: 99%
“…Following Xie et al. (2018) and Wu et al. (2020), we apply techniques from Lample et al. (2017) to translate our primary language training data word-by-word into our secondary languages, and directly copy the entity label of each primary language word to its corresponding translated word. Using embeddings from Bojanowski et al. (2017), we learn a mapping, using the MUSE library, from the primary to the secondary language, making use of identical character strings between the two languages.…”
Section: Experimental Approach (mentioning)
confidence: 99%
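
The translate-and-copy step described in this statement is straightforward to sketch. The snippet below assumes a bilingual lexicon has already been induced (e.g., with the MUSE library from fastText embeddings, as the citing authors describe); the helper function and toy lexicon are illustrative stand-ins, not the authors' code.

```python
# Sketch of data transfer via word-by-word translation with label
# projection: each source word is replaced by its lexicon translation
# and keeps its entity label. The lexicon here is a toy stand-in for
# one induced from aligned embeddings (e.g. with MUSE).
from typing import Dict, List, Tuple

def translate_with_labels(tokens: List[str], labels: List[str],
                          lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Word-by-word translation; out-of-lexicon words are kept as-is,
    since identical strings (names, numbers) often carry over."""
    return [(lexicon.get(tok.lower(), tok), lab)
            for tok, lab in zip(tokens, labels)]

# Toy English->Spanish lexicon for illustration only.
lexicon = {"lives": "vive", "in": "en"}
tokens = ["Obama", "lives", "in", "Madrid"]
labels = ["B-PER", "O", "O", "B-LOC"]
print(translate_with_labels(tokens, labels, lexicon))
# -> [('Obama', 'B-PER'), ('vive', 'O'), ('en', 'O'), ('Madrid', 'B-LOC')]
```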
“…Existing approaches to cross-lingual NER can be roughly grouped into two main categories: instance-based transfer via machine translation (MT) and label projection (Mayhew et al., 2017; Jain et al., 2019), and model-based transfer with aligned cross-lingual word representations or pretrained multilingual language models (Joty et al., 2017; Baumann, 2019; Conneau et al., 2020). Recently, Wu et al. (2020) unify instance-based and model-based transfer via knowledge distillation.…”
Section: Introduction (mentioning)
confidence: 99%
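
The knowledge-distillation step this statement refers to can be sketched briefly: a teacher model (trained on source-language and/or translated data) soft-labels unlabeled target-language tokens, and a student is trained to match those softened distributions. The KL loss and temperature below are common distillation choices assumed for illustration, not necessarily the paper's exact recipe.

```python
# Sketch of the distillation step that unifies the two transfer styles:
# the student matches the teacher's softened label distributions on
# unlabeled target-language text. Loss form and temperature are assumed.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student distributions.
    Both tensors are shaped (batch, seq_len, num_labels)."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * (t * t)

# Usage inside a loop over unlabeled target-language batches:
#   with torch.no_grad():
#       teacher_logits = teacher(**batch).logits
#   loss = distillation_loss(student(**batch).logits, teacher_logits)
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
```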