Proceedings of the 5th Workshop on Representation Learning for NLP 2020
DOI: 10.18653/v1/2020.repl4nlp-1.24

What’s in a Name? Are BERT Named Entity Representations just as Good for any other Name?

Abstract: We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. We highlight that, while such perturbations are natural, state-of-the-art trained models are surprisingly brittle on several tasks. The brittleness persists even with the recent entity-aware BERT models. We also try to discern the cause of this non-robustness, considering factors such as tokenization and frequency of occurrence. Then we provide a simple met…
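To make the perturbation concrete, here is a minimal sketch of a typed entity replacement: each entity span is swapped for another surface form of the same class, and a model's predictions on the original and perturbed inputs can then be compared. The substitute name pools, the example sentence, and the `perturb` helper are illustrative assumptions, not the lists or code used in the paper.

```python
import random

# Minimal sketch of the typed entity-replacement perturbation described in
# the abstract. The substitute name pools below are illustrative
# placeholders, not the lists used in the paper.
SUBSTITUTES = {
    "PERSON": ["Alice Johnson", "Rahul Mehta", "Mei Chen"],
    "GPE": ["Norway", "Kenya", "Chile"],
    "ORG": ["Acme Corp", "Globex", "Initech"],
}

def perturb(text, entities, rng=None):
    """Replace each (start, end, label) character span with a same-type name.

    `entities` is assumed to come from gold annotation or any NER tagger,
    sorted by start offset and non-overlapping.
    """
    rng = rng or random.Random(0)
    out, cursor = [], 0
    for start, end, label in entities:
        out.append(text[cursor:start])
        out.append(rng.choice(SUBSTITUTES.get(label, [text[start:end]])))
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

# Example: a sentence with one PERSON span and one GPE span
# (offsets hand-annotated here for illustration).
sentence = "Barack Obama visited Canada last week."
spans = [(0, 12, "PERSON"), (21, 27, "GPE")]
print(perturb(sentence, spans))
# A robust model should give the same prediction on the original and the
# perturbed input; the paper reports that BERT-based models often do not.
```

In this setup, robustness can be summarized as the fraction of test examples whose prediction changes under such same-class swaps, which is the kind of statistic quoted in the citation statements below.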

Cited by 17 publications (13 citation statements) · References 25 publications
“…Out-of-the-box BERT is surprisingly brittle to named entity replacements: For example, replacing names in the coreference task changes 85% of predictions (Balasubramanian et al., 2020). This suggests that the model does not actually form a generic idea of named entities, although its F1 scores on NER probing tasks are high (Tenney et al., 2019a).…”
Section: Semantic Knowledge
confidence: 99%
“…Recent works have noted that over-relying on entity name information negatively impacts NLU tasks. Balasubramanian et al. (2020) found that substituting named entities in standard test sets of natural language inference, coreference resolution, and grammar error correction has a negative impact on those tasks. In political claims detection (Padó et al., 2019), Dayanik and Padó (2020) show that claims made by frequently occurring politicians in the training data are better recognized than those made by less frequent ones.…”
Section: Diagnosing Bias
confidence: 99%
“…As an example, BLESS (Baroni and Lenci, 2011) includes gold-standard annotations for only 200 concepts. Other approaches have tested semantic ability by using prompt engineering and inspecting the predictions of the models (Petroni et al., 2019; Ettinger, 2020; Talmor et al., 2020), but other works have also shown high variability in the results depending on the prompt design (Balasubramanian et al., 2020; Reynolds and McDonell, 2021; Zhao et al., 2021).…”
Section: Automated Extraction of Concept Relations
confidence: 99%
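The prompt-sensitivity issue mentioned in this last statement can be illustrated with a short sketch, assuming a HuggingFace fill-mask pipeline and two hypothetical paraphrased prompts; neither the model name nor the prompts reproduce the cited papers' setups.

```python
# Hedged sketch of prompt-based probing: query a masked LM with two
# paraphrased prompts for the same fact and compare the top predictions.
# Model name and prompts are illustrative, not the exact probes used in
# Petroni et al. (2019) or Balasubramanian et al. (2020).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

prompts = [
    "The capital of Finland is [MASK].",
    "Finland's capital city is called [MASK].",
]
for p in prompts:
    top = fill(p)[0]  # highest-scoring completion for this prompt
    print(f"{p!r:45} -> {top['token_str']} ({top['score']:.2f})")
# If the predictions or their scores differ noticeably across paraphrases,
# the probe's conclusion reflects prompt design as much as the model's
# underlying knowledge.
```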