Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
DOI: 10.18653/v1/2020.emnlp-main.391

Unsupervised Cross-Lingual Part-of-Speech Tagging for Truly Low-Resource Scenarios

Abstract: We describe a fully unsupervised cross-lingual transfer approach for part-of-speech (POS) tagging under a truly low resource scenario. We assume access to parallel translations between the target language and one or more source languages for which POS taggers are available. We use the Bible as parallel data in our experiments: small size, out-of-domain and covering many diverse languages. Our approach innovates in three ways: 1) a robust approach of selecting training instances via cross-lingual annotation pro…

Cited by 19 publications (32 citation statements)
References 31 publications
“…While the above works deal with generally improving cross-lingual representations, task-specific cross-lingual systems often show strong performance in a zero-shot setting. For POS tagging, in a setting similar to ours, Eskander et al. (2020) achieve strong zero-shot results by using unsupervised projection (Yarowsky et al., 2001) with aligned Bibles. Recent work on cross-lingual NER includes Mayhew et al. (2017), who use dictionary translations to create target-language training data, as well as Xie et al. (2018), who use a bilingual dictionary in addition to self-attention.…”
Section: Introduction
confidence: 89%
“…This approach can, therefore, be seen as a form of distant supervision specific to obtaining labeled data for low-resource languages. Cross-lingual projections have been applied in low-resource settings for tasks such as POS tagging and parsing (Täckström et al., 2013; Wisniewski et al., 2014; Plank and Agić, 2018; Eskander et al., 2020). Sources of parallel text include the OPUS project (Tiedemann, 2012), Bible corpora (Mayer and Cysouw, 2014; Christodoulopoulos and Steedman, 2015), and the recent JW300 corpus (Agić and Vulić, 2019).…”
Section: Cross-lingual Annotation Projections
confidence: 99%
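The citation statements above describe cross-lingual annotation projection: POS tags predicted on the source side of a parallel corpus are carried over to target-language tokens through word alignments. The following is a minimal sketch of that core step, in the spirit of Yarowsky et al. (2001); the function and variable names are illustrative assumptions, not the cited authors' actual implementation, which additionally selects training instances robustly across multiple source languages.

```python
def project_tags(source_tags, alignments, target_len):
    """Project POS tags from a tagged source sentence onto an
    unlabeled target sentence via word alignments.

    source_tags: list of POS tags, one per source token
    alignments:  list of (src_idx, tgt_idx) word-alignment pairs
    target_len:  number of tokens in the target sentence
    Returns a list of projected tags; unaligned tokens get None.
    """
    projected = [None] * target_len
    for src_idx, tgt_idx in alignments:
        # Keep the first projection per target token; a robust system
        # would instead vote across alignments from several source
        # languages and filter low-confidence instances.
        if projected[tgt_idx] is None:
            projected[tgt_idx] = source_tags[src_idx]
    return projected

# Toy example: a three-token English source aligned one-to-one with a
# hypothetical target sentence of the same length.
tags = project_tags(
    source_tags=["DET", "NOUN", "VERB"],
    alignments=[(0, 0), (1, 1), (2, 2)],
    target_len=3,
)
```

The projected tags then serve as (noisy) supervision for training a target-language tagger, which is what makes the approach usable when no gold target-language annotations exist.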
“…While multi-lingual Transformer-based models, e.g. mBERT (Devlin et al., 2019) and XLM-R (Conneau et al., 2020), are widely applied in cross-lingual and multi-lingual NLP tasks (Keung et al., 2019; Eskander et al., 2020), no attempt has been made to extend the findings of the aforementioned mono-lingual research to this context. In this paper, we explore the roles of attention heads in cross-lingual and multi-lingual tasks for two reasons.…”
Section: Introduction
confidence: 99%