Zewen Chi scite author profile

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us to better understand the existing methods for learning cross-lingual representations. More importantly, inspired by the framework, we propose a new pre-training task based on contrastive learning. Specifically, we regard a bilingual sentence pair as two views of the same meaning and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at https://aka.ms/infoxlm.

Cross-Lingual Natural Language Generation via Pre-Training

Chi

Wei

et al. 2020

AAAI

109

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

Chi

Dong

Wei

et al. 2020

Preprint

In this work, we formulate cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us to better understand the existing methods for learning cross-lingual representations. More importantly, the information-theoretic framework inspires us to propose a pre-training task based on contrastive learning. Given a bilingual sentence pair, we regard them as two views of the same meaning, and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at http://aka.ms/infoxlm.
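
The contrastive idea in this abstract (a translation pair should score higher than negative examples) can be illustrated with a generic InfoNCE-style loss. The sketch below is not taken from the InfoXLM code: the function name `info_nce_loss`, the use of other sentences in the batch as negatives, the cosine-similarity scoring, and the temperature value are all assumptions chosen for illustration, and the toy vectors stand in for encoder outputs.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce_loss(src_vecs, tgt_vecs, temperature=0.1):
    """Average InfoNCE loss over a batch of paired sentence vectors.

    src_vecs[i] and tgt_vecs[i] are assumed to encode a translation pair;
    every other tgt_vecs[j] acts as a negative example for src_vecs[i]
    (in-batch negatives are an assumption of this sketch).
    """
    src = [_normalize(v) for v in src_vecs]
    tgt = [_normalize(v) for v in tgt_vecs]
    total = 0.0
    for i, s in enumerate(src):
        # Similarity of source sentence i to every candidate target.
        logits = [sum(a * b for a, b in zip(s, t)) / temperature for t in tgt]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability of picking the true translation.
        total += log_denominator - logits[i]
    return total / len(src)


# Hypothetical toy "encodings" of three sentences in two languages.
english = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
french = [[0.9, 0.1, 0.1], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

aligned = info_nce_loss(english, french)
# Rotating the targets breaks the pairing, so each "positive" is now a wrong sentence.
misaligned = info_nce_loss(english, french[1:] + french[:1])
print(aligned, misaligned)
```

With these toy vectors the correctly paired batch yields a lower loss than the rotated one, which is the behavior the pre-training task is meant to reward. Minimizing a loss of this form is commonly motivated as maximizing a lower bound on the mutual information between the two views.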

Cross-Lingual Natural Language Generation via Pre-Training

Chi

Dong

Wei

et al. 2019

Preprint

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.

Food recommendation with graph convolutional network

Gao

Feng

Huang

et al. 2022

Information Sciences
