Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.523

LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention

Abstract: Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer (Vaswani et al., 2017). The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task based on the masked language model of BERT (Devlin et al., 2019). The task involves predicting…
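
As a rough illustration of how such contextualized word and entity representations can be obtained in practice, the sketch below uses the LUKE port in the Hugging Face transformers library; the model identifier, example sentence, and entity spans are illustrative choices, not taken from the paper itself.

import torch
from transformers import LukeTokenizer, LukeModel

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7), (17, 28)]  # character spans of "Beyoncé" and "Los Angeles"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

word_reprs = outputs.last_hidden_state            # contextualized representations of word tokens
entity_reprs = outputs.entity_last_hidden_state   # contextualized representations of entity tokens

Words and entities are handled as separate token sequences, so the entity vectors come out of a dedicated output alongside the usual word-level hidden states.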

Cited by 338 publications (292 citation statements)
References 28 publications

“…There are also limitations in the integration of gazetteer features. Existing studies often add extra features to a word-level model's Contextual Word Representations (CWRs), which typically contain no information about real-world entities or their spans (Yamada et al., 2020). This concatenation approach is sub-optimal, as it creates additional and often highly correlated features.…”
Section: Complex Entities (mentioning)
confidence: 99%
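
To make the concatenation approach criticized above concrete, here is a minimal, purely illustrative PyTorch sketch (all tensor sizes are hypothetical): a binary gazetteer-match flag is appended to each token's contextual word representation before a simple tagging layer.

import torch

batch, seq_len, hidden, num_tags = 2, 16, 768, 9      # hypothetical sizes
cwr = torch.randn(batch, seq_len, hidden)              # contextual word representations from a word-level model
gazetteer_flag = torch.randint(0, 2, (batch, seq_len, 1)).float()  # 1 if the token matches a gazetteer entry

# The extra feature is simply concatenated to the CWRs, where it is often highly correlated with them.
features = torch.cat([cwr, gazetteer_flag], dim=-1)
tag_logits = torch.nn.Linear(hidden + 1, num_tags)(features)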
“…While a subtagger may learn regularities in entity names, a key limitation is that it needs retraining and evaluation on gazetteer updates. Recent work has considered directly integrating knowledge into transformers, e.g., KnowBert adds knowledge to BERT layers (Peters et al., 2019), and LUKE is pretrained to predict masked entities (Yamada et al., 2020). The drawbacks of such methods are that they are specific to Transformers and that the model's knowledge cannot be updated without retraining.…”
Section: Related Work (mentioning)
confidence: 99%
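
The masked-entity pretraining objective mentioned for LUKE can be sketched schematically as follows; this is not the authors' code, only an illustrative PyTorch rendering of predicting the identities of masked entity tokens over an entity vocabulary (all sizes and ids below are made up).

import torch
import torch.nn as nn

entity_vocab_size, hidden = 500_000, 768              # hypothetical sizes
entity_head = nn.Linear(hidden, entity_vocab_size)    # scores every entry of the entity vocabulary

entity_reprs = torch.randn(4, hidden)                  # encoder outputs for 4 entity tokens (stand-in values)
gold_entity_ids = torch.tensor([31, 7, 1902, 42])      # their true entity-vocabulary ids
masked_positions = torch.tensor([1, 3])                # entity tokens that were replaced by [MASK]

# Predict the original entity only at the masked positions, as in masked language modeling.
logits = entity_head(entity_reprs[masked_positions])
loss = nn.functional.cross_entropy(logits, gold_entity_ids[masked_positions])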
“…In NER, we also conduct a comparison on the revised version of the German dataset from the CoNLL 2006 shared task (Buchholz and Marsi, 2006). Recent work such as Yu et al. (2020) and Yamada et al. (2020) utilizes document contexts in the datasets. We follow their work and extract document embeddings for the transformer-based embeddings.…”
Section: Comparison With State-of-the-art Approaches (mentioning)
confidence: 99%
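
As one way to illustrate the use of document context with transformer-based embeddings, the sketch below encodes a target sentence together with its neighbouring sentences and keeps only the target sentence's sub-word vectors; the model name and toy document are assumptions, not taken from the cited papers.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

document = ["EU rejects German call to boycott British lamb .", "Peter Blackburn .", "BRUSSELS 1996-08-22 ."]
target_idx = 0  # the sentence we actually want embeddings for

left = " ".join(document[:target_idx])
right = " ".join(document[target_idx + 1:])
text = (left + " " if left else "") + document[target_idx] + (" " + right if right else "")
target_start = len(left) + 1 if left else 0
target_end = target_start + len(document[target_idx])

enc = tokenizer(text, return_tensors="pt", return_offsets_mapping=True, truncation=True)
offsets = enc.pop("offset_mapping")[0]
with torch.no_grad():
    hidden = model(**enc).last_hidden_state[0]

# Keep only the sub-word vectors whose character span lies inside the target sentence.
keep = torch.tensor([s < e and s >= target_start and e <= target_end for s, e in offsets.tolist()])
target_vectors = hidden[keep]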
“…NEL typically involves two tasks: recognizing named entities in a given text and then disambiguating the entity mentions against the knowledge base (KB). Researchers have achieved great success in NER with the help of Convolutional Neural Networks (CNNs), Bidirectional Recurrent Neural Networks (Bi-RNNs), and attention mechanisms along with a CRF decoder (Chiu and Nichols, 2016; Akbik et al., 2018; Ghaddar and Langlais, 2018; Jiang et al., 2019; Baevski et al., 2019; Yamada et al., 2020). Deep neural networks (DNNs) are also dominant in entity resolution tasks.…”
Section: Neural Entity Linking (mentioning)
confidence: 99%
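
The two-stage picture of NEL described above (mention recognition followed by disambiguation against a KB) can be shown with a deliberately toy sketch; the string-matching "recognizer", the dictionary KB, and the entity ids are all hypothetical placeholders, not a real system.

def recognize_mentions(text, surface_forms):
    """Stage 1 (toy): find character spans of known surface forms in the text."""
    spans = []
    for name in surface_forms:
        start = text.find(name)
        if start != -1:
            spans.append((start, start + len(name)))
    return spans

def disambiguate(mention, kb):
    """Stage 2 (toy): map a mention string to a knowledge-base identifier."""
    return kb.get(mention, "NIL")

kb = {"Beyoncé": "E1", "Los Angeles": "E2"}  # toy KB: surface form -> entity id
text = "Beyoncé lives in Los Angeles."
links = [(text[s:e], disambiguate(text[s:e], kb)) for s, e in recognize_mentions(text, kb)]
print(links)  # [('Beyoncé', 'E1'), ('Los Angeles', 'E2')]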