Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.505
mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models

Abstract: Recent studies have shown that multilingual pretrained language models can be effectively improved with cross-lingual alignment information from Wikipedia entities. However, existing methods only exploit entity information in pretraining and do not explicitly use entities in downstream tasks. In this study, we explore the effectiveness of leveraging entity representations for downstream cross-lingual tasks. We train a multilingual language model with 24 languages with entity representations and show the model …

Cited by 12 publications (4 citation statements)
References 30 publications
“…This progress has been accompanied by the creation of entity-driven datasets for tasks such as language modeling [1,37,59], question answering [32,34,42,71,87], fact checking [4,55,73] and information extraction [85,89], to name a few. Yet, recent findings [18,24,41,64,70,76] suggest that entity representation and identification (i.e., identifying the correct entity that matches a given text) are among the main challenges that must be solved to further improve performance on such datasets. We believe that TempEL can contribute to addressing these challenges by: (i) encouraging research on devising more robust methods for creating entity representations that are invariant to temporal changes; and (ii) improving entity identification in non-trivial scenarios involving ambiguous and uncommon mentions (e.g., those linked to overshadowed entities as defined above).…”
Section: Entity-driven Datasets
confidence: 99%
“…This progress has been accompanied by the creation of entity-driven datasets for tasks such as language modeling [238–240], question answering [241–245], fact checking [16,17,246] and information extraction [4,48], to name a few. Yet, recent findings [21,247–251] suggest that entity representation and identification (i.e., identifying the correct entity that matches a given text) are among the main challenges that must be solved to further improve performance on such datasets. We believe that TempEL can contribute to addressing these challenges by: (i) encouraging research on devising more robust methods for creating entity representations that are invariant to temporal changes; and (ii) improving entity identification in non-trivial scenarios involving ambiguous and uncommon mentions (e.g., those linked to overshadowed entities as defined above).…”
Section: Entity-driven Datasets
confidence: 99%
“…Following this, a few recent attempts have been made to enhance multilingual PLMs with Wikipedia or KG triples [7,163,164]. However, owing to the structural difference between KGs and text, existing KG-based pretraining often relies on extra relation/entity embeddings or additional KG encoders for knowledge enhancement.…”
Section: Chapter Background
confidence: 99%
“…These extra embeddings/components may add significantly more parameters, which in turn increases inference complexity, or cause inconsistency between pretraining and downstream tasks. For example, mLUKE [164] has to enumerate all possible entity spans for NER to minimize the inconsistency caused by entity and entity-position embeddings. Other methods [7,154] We evaluate KMLM on a wide range of knowledge-intensive cross-lingual tasks, including NER, factual knowledge retrieval, relation classification, and logical reasoning, a novel task designed by us to test the reasoning capability of the models.…”
Section: Chapter Background
confidence: 99%
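The excerpt above notes that mLUKE must enumerate all possible entity spans for NER so that downstream span classification matches the entity inputs seen in pretraining. As a minimal sketch of what "enumerating all possible entity spans" means (this is illustrative, not mLUKE's actual implementation; the function name and the maximum span width are assumptions):

```python
def enumerate_entity_spans(tokens, max_span_len=16):
    """Return all contiguous (start, end) spans over `tokens`,
    with `end` exclusive and span width capped at `max_span_len`.
    Each span would then be paired with an entity ([MASK]) embedding
    and classified as an entity type or as "not an entity"."""
    spans = []
    for start in range(len(tokens)):
        # Widths 1 .. max_span_len, clipped at the sequence boundary.
        for end in range(start + 1, min(start + max_span_len, len(tokens)) + 1):
            spans.append((start, end))
    return spans

tokens = ["Tokyo", "is", "the", "capital", "of", "Japan"]
candidate_spans = enumerate_entity_spans(tokens, max_span_len=3)
```

The quadratic number of candidates (O(n · max_span_len) per sentence) is precisely the inference-cost concern the citing passage raises.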