2022
DOI: 10.1609/aaai.v36i10.21330

XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge

Abstract: Cross-lingual pre-training has achieved great successes using monolingual and bilingual plain text corpora. However, most pre-trained models neglect multilingual knowledge, which is language agnostic but comprises abundant cross-lingual structure alignment. In this paper, we propose XLM-K, a cross-lingual language model incorporating multilingual knowledge in pre-training. XLM-K augments existing multilingual pre-training with two knowledge tasks, namely Masked Entity Prediction Task and Object Entailment Task…
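
The abstract names two knowledge tasks but does not spell out their formulation in this excerpt. As a loose illustration only (not the authors' code), the sketch below reads Masked Entity Prediction as classifying a masked mention against a flat entity vocabulary on top of a generic multilingual encoder; the encoder choice, the MaskedEntityPredictionHead class, and all hyperparameters are assumptions.

```python
# Illustrative sketch only: one plausible reading of a "Masked Entity Prediction"
# objective, NOT the authors' released code. Assumes a generic multilingual
# encoder (XLM-R via Hugging Face transformers) and a flat entity vocabulary.
import torch
import torch.nn as nn
from transformers import AutoModel

class MaskedEntityPredictionHead(nn.Module):
    def __init__(self, encoder_name="xlm-roberta-base", num_entities=100_000):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Linear classifier over the (assumed) entity vocabulary.
        self.entity_head = nn.Linear(hidden, num_entities)

    def forward(self, input_ids, attention_mask, mention_positions, entity_labels):
        # Encode the sentence whose entity mentions have been masked.
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state                                   # (batch, seq, hidden)
        # Gather the hidden state at each masked mention position.
        idx = mention_positions.unsqueeze(-1).expand(-1, -1, hidden_states.size(-1))
        mention_repr = hidden_states.gather(1, idx)           # (batch, mentions, hidden)
        logits = self.entity_head(mention_repr)               # (batch, mentions, |E|)
        # Cross-entropy against gold entity ids linked from a knowledge base;
        # -100 marks padded mention slots.
        return nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), entity_labels.view(-1), ignore_index=-100
        )
```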

Cited by 14 publications (16 citation statements)
References 43 publications

Citation statements:

“…Our experiments have demonstrated that entity supervision in EASE improves the quality of sentence embeddings both in the monolingual setting and, in particular, the multilingual setting. As recent studies have shown, entity annotations can be used as anchors to learn quality cross-lingual representations (Calixto et al., 2021; Nishikawa et al., 2021; Jian et al., 2022; Ri et al., 2022), and our work is another demonstration of their utility, particularly in sentence embeddings. One promising future direction is exploring how to better exploit the cross-lingual nature of entities.…”
Section: Discussion (mentioning)
Confidence: 60%

“…thus offer a useful cross-lingual alignment supervision (Calixto et al., 2021; Nishikawa et al., 2021; Jian et al., 2022; Ri et al., 2022). The extensive multilingual support of Wikipedia alleviates the need for a parallel resource to train well-aligned multilingual sentence embeddings, especially for low-resource languages.…”
Section: Introduction (mentioning)
Confidence: 99%

“…Unicoder (Huang et al., 2019) presents several pre-training tasks upon parallel corpora and ERNIE-M (Ouyang et al., 2021) learns semantic alignment by leveraging back translation. XLM-K (Jiang et al., 2022) leverages a multilingual knowledge base to improve cross-lingual performance on knowledge-related tasks. InfoXLM (Chi et al., 2021) and HiCTL (Wei et al., 2020) encourage bilingual alignment via an InfoNCE-based contrastive loss.…”
Section: Related Work (mentioning)
Confidence: 99%

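The excerpt above mentions InfoXLM and HiCTL encouraging bilingual alignment via an InfoNCE-based contrastive loss. As a generic illustration (not the exact objective of either paper), the sketch below computes an in-batch InfoNCE loss over paired sentence embeddings; the temperature value and the use of in-batch negatives are assumptions.

```python
# Generic in-batch InfoNCE over paired sentence embeddings (e.g. a translation
# pair per row). A common formulation, not the exact loss of InfoXLM or HiCTL.
import torch
import torch.nn.functional as F

def infonce_loss(src_emb: torch.Tensor, tgt_emb: torch.Tensor, temperature: float = 0.05):
    """src_emb, tgt_emb: (batch, dim); row i of each tensor forms a positive pair."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    # Similarity of every source sentence against every target sentence in the batch.
    logits = src @ tgt.t() / temperature                  # (batch, batch)
    targets = torch.arange(src.size(0), device=src.device)
    # The matching translation is the positive; all other rows act as negatives.
    return F.cross_entropy(logits, targets)
```
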
“…Sequence Tagging Model: We implement a BiLSTM-CRF model [200] with the Flair framework [292] to evaluate our data augmentation method on NER and POS tasks. We use a single-layer BiLSTM with hidden state size 512. Dropout layers are applied before and after the BiLSTM layer with dropout rate 0.5.…”
Section: Basic Models (mentioning)
Confidence: 99%

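The excerpt above describes a BiLSTM-CRF tagger built with the Flair framework (single-layer BiLSTM, hidden size 512, dropout 0.5). A rough sketch of such a setup with Flair's SequenceTagger follows; the corpus and embeddings are placeholders, and argument names may differ slightly across Flair versions.

```python
# Rough sketch of a BiLSTM-CRF tagger in Flair matching the quoted settings
# (single-layer BiLSTM, hidden size 512, dropout 0.5). The corpus and embeddings
# are placeholders; argument names may vary slightly across Flair versions.
from flair.datasets import CONLL_03
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = CONLL_03()                               # placeholder NER corpus
tag_dictionary = corpus.make_label_dictionary(label_type="ner")

tagger = SequenceTagger(
    hidden_size=512,                              # BiLSTM hidden state size
    embeddings=WordEmbeddings("glove"),           # placeholder embeddings
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=True,                                 # CRF decoding layer
    rnn_layers=1,                                 # single BiLSTM layer
    dropout=0.5,                                  # dropout around the BiLSTM
)

trainer = ModelTrainer(tagger, corpus)
trainer.train("taggers/bilstm-crf-ner", max_epochs=10)
```
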
“…Following this, a few recent attempts have been made to enhance multilingual PLMs with Wikipedia or KG triples [7,163,164]. However, due to the structural difference between KG and texts, existing KG based pretraining often relies on extra relation/entity embeddings or additional KG encoders for knowledge enhancement.…”
Section: Chapter Background (mentioning)
Confidence: 99%