Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.522
Editing Factual Knowledge in Language Models

Abstract: The factual knowledge acquired during pretraining and stored in the parameters of Language Models (LMs) can be useful in downstream tasks (e.g., question answering or textual inference). However, some facts can be incorrectly induced or become obsolete over time. We present KNOWLEDGEEDITOR, a method which can be used to edit this knowledge and, thus, fix 'bugs' or unexpected predictions without the need for expensive retraining or fine-tuning. Besides being computationally efficient, KNOWLEDGEEDITOR does not r…

Cited by 62 publications (106 citation statements) · References 36 publications
“…A simple approach to making such edits is additional fine-tuning with a new label on the single example to be corrected. Yet fine-tuning on a single example tends to overfit, even when constraining the distance between the pre- and post-fine-tuning parameters (Zhu et al., 2020; Cao et al., 2021). This overfitting leads to failures of both locality and generality.…”
Section: Introduction
confidence: 99%
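The statement above describes the naive editing baseline: fine-tune on the single corrected example while constraining how far the parameters may drift from their pre-edit values. A minimal, illustrative sketch of that idea on a toy linear model is below; the function name, toy data, and the L2-ball projection are assumptions for illustration, not the actual procedure of Zhu et al. (2020) or of KNOWLEDGEEDITOR.

```python
import math

def constrained_single_example_update(w, x, y, lr=0.1, steps=50, radius=0.5):
    """Fine-tune a toy linear model on one (x, y) pair under a drift constraint.

    Runs gradient descent on the squared error (w.x - y)^2 and, after each
    step, projects the weights back into an L2 ball of the given radius
    around the pre-edit weights, so ||w_post - w_pre|| <= radius.
    """
    w0 = list(w)          # pre-edit weights (kept fixed as the anchor)
    w = list(w)
    for _ in range(steps):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        grad = [2.0 * (pred - y) * xi for xi in x]   # d/dw (w.x - y)^2
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
        delta = [wi - w0i for wi, w0i in zip(w, w0)]
        norm = math.sqrt(sum(d * d for d in delta))
        if norm > radius:                            # project back into the ball
            scale = radius / norm
            w = [w0i + d * scale for w0i, d in zip(w0, delta)]
    return w

# Example: nudge the model toward predicting 3.0 on one input.
w_pre = [1.0, -1.0, 0.5]
w_post = constrained_single_example_update(w_pre, [1.0, 0.0, 1.0], y=3.0)
# The edit moves the prediction toward the target while the weights stay
# within the allowed L2 distance of the originals.
```

The projection is what the constraint buys you: the edited model cannot wander arbitrarily far from the pre-edit parameters. The quoted passage's point is that even with this constraint, a single-example update can still overfit, failing on paraphrases (generality) or perturbing unrelated facts (locality).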
“…Our evaluations confirm a distinction between generalized knowing at the early MLP site and rote saying at the late self-attention site (Section 5.3). Furthermore, when compared to fine-tuning (Zhu et al., 2020) and meta-learning (Mitchell et al., 2021; De Cao et al., 2021), our benchmarks find that the explicitly localized ROME method avoids both generalization and specificity failures seen in other knowledge editing approaches, outperforming state-of-the-art opaque methods even at billion-parameter scale (Section 5.4).…”
Section: Introduction
confidence: 74%
“…Despite increasing adoption of this architecture, their knowledge representation remains under-explored. Research has been done for masked models (Petroni et al., 2019; Jiang et al., 2020; Elazar et al., 2021a; Geva et al., 2021; Dai et al., 2021; De Cao et al., 2021), but GPT's architectural differences (e.g., unidirectional attention, generation capabilities) provide an opportunity for new insights.…”
Section: Introduction
confidence: 99%
“…A related line of work has explored editing neural predictions after training given a dataset of revised input and output pairs (Sinitsin et al., 2020; Zhu et al., 2020; Cao et al., 2021). Here we introduce a different setting where we have access to new unlabeled text after model training, which must be used implicitly to update the factual predictions of the model.…”
Section: Related Work
confidence: 99%