Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.149

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

Abstract: Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English. While the advance of pretrained multilingual encoders suggests an easy optimism of "train on English, run on any language", we find through a thorough exploration and extension of techniques that a combination of approaches, both new and old, leads to better performance than any one cross-lingual strategy in part…

Cited by 11 publications (9 citation statements)
References 45 publications
“…Table 6 shows that multilingual models trained using all three languages achieve slightly to moderately worse performance on each test set compared to their monolingual counterparts. This is unexpected, as prior work (Fei et al., 2020; Daza and Frank, 2020; Yarmohammadi et al., 2021) finds benefits to using silver data. The poorer performance of the multilingual model could be due to the same set of hyperparameters used for all three languages.…”
Section: Multilingual Models
confidence: 85%
“…Data Projection. Using annotations in English to create data in a target language has been useful for tasks such as semantic role labeling (Akbik et al., 2015; Aminian et al., 2019), information extraction (Riloff et al., 2002), POS tagging (Yarowsky and Ngai, 2001), and dependency parsing (Ozaki et al., 2021). Previous work finds improvements when training on a mixture of gold source-language data and projected silver target-language data in cross-lingual tasks such as semantic role labeling (Fei et al., 2020; Daza and Frank, 2020) and information extraction (Yarmohammadi et al., 2021). The intuition of using both gold and projected silver data is to allow the model to see high-quality gold data as well as data with target-language statistics.…”
Section: Multilinguality
confidence: 99%
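The gold-plus-silver training mixture described in the excerpt above is straightforward to realize: concatenate the gold source-language examples with the projected silver target-language examples and train on the combined set. Below is a minimal sketch, assuming a token-level tagging task; the tiny in-line examples and the `train_tagger` placeholder are hypothetical, and only the mixing step itself reflects the cited recipe.

```python
# Minimal sketch of mixing gold English data with projected silver
# target-language data. The examples and `train_tagger` are placeholders.
import random

gold_en = [  # gold English annotations
    {"tokens": ["Obama", "visited", "Paris"], "labels": ["B-PER", "O", "B-LOC"]},
]
silver_ar = [  # silver Arabic annotations obtained by projecting the English labels
    {"tokens": ["أوباما", "زار", "باريس"], "labels": ["B-PER", "O", "B-LOC"]},
]

# One combined training set: gold data supplies label quality,
# silver data supplies target-language statistics.
mixed = gold_en + silver_ar
random.shuffle(mixed)

for example in mixed:
    print(example["tokens"], example["labels"])
# train_tagger(mixed)  # placeholder for whatever downstream trainer is used
```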
“…GATE (Ahmad et al., 2021) follows Subburathinam et al. (2019) and uses a graph convolutional architecture and pretrained knowledge from language models to further improve the performance. Yarmohammadi et al. (2021) first translate the whole sentence and then use token aligners to obtain a sub-sentential alignment, which has been shown to be beneficial. We use a different translation strategy, and our proposed adversarial training approach may also be helpful with their translations.…”
Section: Cross-lingual
confidence: 99%
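The translate-then-align projection that the excerpt attributes to Yarmohammadi et al. (2021) can be illustrated with a small helper: translate the sentence, obtain word alignments, and map each annotated source span onto the target tokens it aligns to. The sketch below assumes the alignment pairs have already been produced by an external aligner (e.g., fast_align or awesome-align); the function name and example are illustrative, not the authors' actual implementation.

```python
# Sketch of projecting an annotated source span onto a translated sentence
# via token alignments supplied by an external word aligner.
from typing import List, Optional, Tuple

def project_span(span: Tuple[int, int],
                 alignments: List[Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Map a source token span [start, end) to a target span using (src, tgt) links."""
    tgt_positions = [t for s, t in alignments if span[0] <= s < span[1]]
    if not tgt_positions:
        return None  # no aligned target tokens: drop this silver annotation
    return min(tgt_positions), max(tgt_positions) + 1

# Example: English "Barack Obama visited Paris" -> Spanish "Barack Obama visitó París"
alignments = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(project_span((0, 2), alignments))  # PERSON span maps to (0, 2) in the translation
```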
“…Cross-lingual learning has been proposed to leverage resources in data-rich languages to train NLP models for data-scarce languages (Ruder et al., 2019). There are two main strategies for building cross-lingual models: (1) train models with multilingual language models and language-universal features that are transferable to the target language (Huang et al., 2019; Hsu et al., 2019; Hu et al., 2020a; Luo et al., 2020; Ouyang et al., 2021; Subburathinam et al., 2019; M'hamdi et al., 2019; Ahmad et al., 2021); (2) use machine translation models in a pipeline, either by transforming annotated training data into the desired target language to build target-language models, or by translating data at inference time into the source language and applying source-language models (Cui et al., 2019; Hu et al., 2020a; Yarmohammadi et al., 2021). The first approach relies on the quality of the constructed multilingual semantic space; the discrepancy between source-language training data and target-language evaluation data may cause overfitting.…”
Section: Introduction
confidence: 99%
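Strategy (1) above, zero-shot transfer through a shared multilingual encoder, amounts to fine-tuning on source-language annotations only and then running the same model on target-language text. A minimal sketch with the Hugging Face transformers API follows; the model name, label count, and Arabic example sentence are illustrative assumptions, and the fine-tuning loop itself is elided.

```python
# Sketch of "train on English, run on any language": fine-tune a multilingual
# encoder on English token-classification data, then apply it unchanged to a
# target language. Label count and example sentence are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=9  # e.g., a small BIO entity tag set
)

# ... fine-tune `model` here on English annotations only ...

# Zero-shot inference on Arabic text, with no Arabic labels seen in training.
inputs = tokenizer("زار باراك أوباما باريس", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_tag_ids = logits.argmax(dim=-1)
print(predicted_tag_ids)
```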
“…model (Zhang et al., 2021; NLLB Team et al., 2022), or even utilizing language-specific pretrained language models (Xu et al., 2021; Yarmohammadi et al., 2021). All these studies indicate the importance of language-specific parameters.…”
Section: Introduction
confidence: 99%