Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.71

E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT

Abstract: We present a novel way of injecting factual knowledge about entities into the pretrained BERT model (Devlin et al., 2019): We align Wikipedia2Vec entity vectors (Yamada et al., 2016) with BERT's native wordpiece vector space and use the aligned entity vectors as if they were wordpiece vectors. The resulting entity-enhanced version of BERT (called E-BERT) is similar in spirit to ERNIE (Zhang et al., 2019) and KnowBert (Peters et al., 2019), but it requires no expensive further pretraining of the BERT encoder…
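The alignment step described in the abstract can be sketched briefly. The snippet below is an illustrative reconstruction from the abstract only, not the authors' released code: it assumes the alignment is an unconstrained linear map fit by ordinary least squares on the vocabulary shared between Wikipedia2Vec's word vectors and BERT's wordpiece input embeddings, and the function names (`fit_alignment`, `align_entity_vectors`) are hypothetical.

```python
import numpy as np

def fit_alignment(src_vecs: np.ndarray, tgt_vecs: np.ndarray) -> np.ndarray:
    """Fit a linear map W from the Wikipedia2Vec space to BERT's wordpiece space.

    src_vecs: (n, d_src) Wikipedia2Vec word vectors for the shared vocabulary.
    tgt_vecs: (n, d_tgt) BERT input embeddings for the same wordpieces.
    Solves min_W ||src_vecs @ W - tgt_vecs||_F^2 by ordinary least squares.
    """
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W  # shape (d_src, d_tgt)

def align_entity_vectors(entity_vecs: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Project Wikipedia2Vec entity vectors into BERT's wordpiece embedding space."""
    return entity_vecs @ W
```

As the abstract states, the aligned entity vectors are then fed to BERT as if they were wordpiece vectors, so no further pretraining of the encoder is needed.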

Cited by 110 publications (116 citation statements)
References 19 publications
“…This argues for proponents of resource-hungry deep learning models to try harder to find cheap "green" baselines or to combine the best of both worlds (cf. Poerner et al., 2020).…”
Section: Model (mentioning)
confidence: 99%
“…Factual Knowledge Retrieval from LMs. Several works have focused on probing factual knowledge solely from pre-trained LMs without access to external knowledge. They do so by either using prompts and letting the LM fill in the blanks, which assumes that the LM is a static knowledge source (Petroni et al., 2019; Jiang et al., 2020; Poerner et al., 2019; Bouraoui et al., 2020), or fine-tuning the LM on a set of question-answer pairs to directly generate answers, which dynamically adapts the LM to this particular task (Roberts et al., 2020). Impressive results demonstrated by these works indicate that large-scale LMs contain a significant amount of knowledge, in some cases even outperforming competitive question answering systems relying on external resources (Roberts et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…Language models (LMs; Church, 1988; Kneser and Ney, 1995; Bengio et al., 2003) learn to model the probability distribution of text, and in doing so capture information about various aspects of the syntax or semantics of the language at hand. Recent works have presented intriguing results demonstrating that modern large-scale LMs also capture a significant amount of factual knowledge (Petroni et al., 2019; Jiang et al., 2020; Poerner et al., 2019). This knowledge is generally probed by having the LM fill in the blanks of cloze-style prompts.
[Figure 1: X-FACTR contains 23 languages, for which the data availability varies dramatically.]…”
Section: Introduction (mentioning)
confidence: 99%
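The cloze-style probing these excerpts describe can be illustrated with a short, self-contained example. This is a generic sketch using the Hugging Face `transformers` fill-mask pipeline with an off-the-shelf `bert-base-cased` checkpoint and a made-up prompt; it is not code from any of the cited papers.

```python
from transformers import pipeline

# Cloze-style factual probing: a masked LM fills in the blank of a prompt.
fill_mask = pipeline("fill-mask", model="bert-base-cased")

# Illustrative prompt (not taken from the cited papers).
prompt = "The capital of France is [MASK]."

for prediction in fill_mask(prompt, top_k=3):
    # Each prediction carries the predicted token and its probability.
    print(prediction["token_str"], round(prediction["score"], 3))
```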
“…However, the introduction of BERT has essentially eliminated the need for static word vectors in standard settings. On the other hand, several authors have shown that it can be beneficial to incorporate entity vectors with BERT, allowing the model to exploit factual or commonsense knowledge from structured sources (Lin et al., 2019; Poerner et al., 2019).…”
Section: Related Work (mentioning)
confidence: 99%