Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.149

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

Abstract: Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English. While the advance of pretrained multilingual encoders suggests an easy optimism of "train on English, run on any language", we find through a thorough exploration and extension of techniques that a combination of approaches, both new and old, leads to better performance than any one cross-lingual strategy in part…

Cited by 11 publications (9 citation statements)
References 45 publications
“…Table 6 shows that multilingual models trained using all three languages achieve slightly to moderately worse performance on each test set compared to their monolingual counterparts. This is unexpected, as prior work (Fei et al., 2020; Daza and Frank, 2020; Yarmohammadi et al., 2021) finds benefits to using silver data. The poorer performance of the multilingual model could be due to the same set of hyperparameters used for all three languages.…”
Section: Multilingual Models
confidence: 85%
“…Data Projection. Using annotations in English to create data in a target language has been useful for tasks such as semantic role labeling (Akbik et al., 2015; Aminian et al., 2019), information extraction (Riloff et al., 2002), POS tagging (Yarowsky and Ngai, 2001), and dependency parsing (Ozaki et al., 2021). Previous work finds improvements when training on a mixture of gold source-language data and projected silver target-language data in cross-lingual tasks such as semantic role labeling (Fei et al., 2020; Daza and Frank, 2020) and information extraction (Yarmohammadi et al., 2021). The intuition of using both gold and projected silver data is to allow the model to see high-quality gold data as well as data with target-language statistics.…”
Section: Multilinguality
confidence: 99%
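The gold-plus-silver training mixture described in the excerpt above is straightforward to realize: concatenate the gold source-language examples with the projected silver target-language examples and train on the combined set. Below is a minimal sketch, assuming a token-level tagging task; the tiny in-line examples and the `train_tagger` placeholder are hypothetical, and only the mixing step itself reflects the cited recipe.

```python
# Minimal sketch of mixing gold English data with projected silver
# target-language data. The examples and `train_tagger` are placeholders.
import random

gold_en = [  # gold English annotations
    {"tokens": ["Obama", "visited", "Paris"], "labels": ["B-PER", "O", "B-LOC"]},
]
silver_ar = [  # silver Arabic annotations obtained by projecting the English labels
    {"tokens": ["أوباما", "زار", "باريس"], "labels": ["B-PER", "O", "B-LOC"]},
]

# One combined training set: gold data supplies label quality,
# silver data supplies target-language statistics.
mixed = gold_en + silver_ar
random.shuffle(mixed)

for example in mixed:
    print(example["tokens"], example["labels"])
# train_tagger(mixed)  # placeholder for whatever downstream trainer is used
```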
“…GATE (Ahmad et al., 2021) follows Subburathinam et al. (2019) and uses a graph convolutional architecture and pretrained knowledge from language models to further improve the performance. Yarmohammadi et al. (2021) first translate the whole sentence and then use token aligners to obtain a sub-sentential alignment, which has been shown to be beneficial. We use a different translation strategy, and our proposed adversarial training approach may also be helpful with their translations.…”
Section: Cross-lingual
confidence: 99%
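The translate-then-align projection that the excerpt attributes to Yarmohammadi et al. (2021) can be illustrated with a small helper: translate the sentence, obtain word alignments, and map each annotated source span onto the target tokens it aligns to. The sketch below assumes the alignment pairs have already been produced by an external aligner (e.g., fast_align or awesome-align); the function name and example are illustrative, not the authors' actual implementation.

```python
# Sketch of projecting an annotated source span onto a translated sentence
# via token alignments supplied by an external word aligner.
from typing import List, Optional, Tuple

def project_span(span: Tuple[int, int],
                 alignments: List[Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Map a source token span [start, end) to a target span using (src, tgt) links."""
    tgt_positions = [t for s, t in alignments if span[0] <= s < span[1]]
    if not tgt_positions:
        return None  # no aligned target tokens: drop this silver annotation
    return min(tgt_positions), max(tgt_positions) + 1

# Example: English "Barack Obama visited Paris" -> Spanish "Barack Obama visitó París"
alignments = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(project_span((0, 2), alignments))  # PERSON span maps to (0, 2) in the translation
```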
“…Cross-lingual learning has been proposed to leverage resources in data-rich languages to train NLP models for data-scarce languages (Ruder et al., 2019). There are two main strategies for building cross-lingual models: (1) train models with multilingual language models and language-universal features that are transferable to the target language (Huang et al., 2019; Hsu et al., 2019; Hu et al., 2020a; Luo et al., 2020; Ouyang et al., 2021; Subburathinam et al., 2019; M'hamdi et al., 2019; Ahmad et al., 2021); (2) use machine translation models in a pipeline, either by transforming annotated training data into the desired target language to build target-language models, or by translating data at inference time into the source language and applying source-language models (Cui et al., 2019; Hu et al., 2020a; Yarmohammadi et al., 2021). The first approach relies on the quality of the constructed multilingual semantic space; the discrepancy between source-language training data and target-language evaluation data may cause overfitting.…”
Section: Introduction
confidence: 99%
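Strategy (1) above, zero-shot transfer through a shared multilingual encoder, amounts to fine-tuning on source-language annotations only and then running the same model on target-language text. A minimal sketch with the Hugging Face transformers API follows; the model name, label count, and Arabic example sentence are illustrative assumptions, and the fine-tuning loop itself is elided.

```python
# Sketch of "train on English, run on any language": fine-tune a multilingual
# encoder on English token-classification data, then apply it unchanged to a
# target language. Label count and example sentence are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=9  # e.g., a small BIO entity tag set
)

# ... fine-tune `model` here on English annotations only ...

# Zero-shot inference on Arabic text, with no Arabic labels seen in training.
inputs = tokenizer("زار باراك أوباما باريس", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_tag_ids = logits.argmax(dim=-1)
print(predicted_tag_ids)
```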
“…model (Zhang et al., 2021; NLLB Team et al., 2022), or even utilizing language-specific pretrained language models (Xu et al., 2021; Yarmohammadi et al., 2021). All these studies indicate the importance of language-specific parameters.…”
Section: Introduction
confidence: 99%