Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.522
Editing Factual Knowledge in Language Models

Abstract: The factual knowledge acquired during pretraining and stored in the parameters of Language Models (LMs) can be useful in downstream tasks (e.g., question answering or textual inference). However, some facts can be incorrectly induced or become obsolete over time. We present KNOWLEDGEEDITOR, a method which can be used to edit this knowledge and, thus, fix 'bugs' or unexpected predictions without the need for expensive retraining or fine-tuning. Besides being computationally efficient, KNOWLEDGEEDITOR does not r…

Cited by 62 publications (106 citation statements) · References 36 publications
“…A simple approach to making such edits is additional fine-tuning with a new label on the single example to be corrected. Yet fine-tuning on a single example tends to overfit, even when constraining the distance between the pre- and post-fine-tuning parameters (Zhu et al., 2020; Cao et al., 2021). This overfitting leads to failures of both locality and generality.…”
Section: Introduction
confidence: 99%
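The statement above describes the naive editing baseline: fine-tune on the single corrected example while constraining how far the parameters may drift from their pre-edit values. A minimal, illustrative sketch of that idea on a toy linear model is below; the function name, toy data, and the L2-ball projection are assumptions for illustration, not the actual procedure of Zhu et al. (2020) or of KNOWLEDGEEDITOR.

```python
import math

def constrained_single_example_update(w, x, y, lr=0.1, steps=50, radius=0.5):
    """Fine-tune a toy linear model on one (x, y) pair under a drift constraint.

    Runs gradient descent on the squared error (w.x - y)^2 and, after each
    step, projects the weights back into an L2 ball of the given radius
    around the pre-edit weights, so ||w_post - w_pre|| <= radius.
    """
    w0 = list(w)          # pre-edit weights (kept fixed as the anchor)
    w = list(w)
    for _ in range(steps):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        grad = [2.0 * (pred - y) * xi for xi in x]   # d/dw (w.x - y)^2
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
        delta = [wi - w0i for wi, w0i in zip(w, w0)]
        norm = math.sqrt(sum(d * d for d in delta))
        if norm > radius:                            # project back into the ball
            scale = radius / norm
            w = [w0i + d * scale for w0i, d in zip(w0, delta)]
    return w

# Example: nudge the model toward predicting 3.0 on one input.
w_pre = [1.0, -1.0, 0.5]
w_post = constrained_single_example_update(w_pre, [1.0, 0.0, 1.0], y=3.0)
# The edit moves the prediction toward the target while the weights stay
# within the allowed L2 distance of the originals.
```

The projection is what the constraint buys you: the edited model cannot wander arbitrarily far from the pre-edit parameters. The quoted passage's point is that even with this constraint, a single-example update can still overfit, failing on paraphrases (generality) or perturbing unrelated facts (locality).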
“…Our evaluations confirm a distinction between generalized knowing at the early MLP site and rote saying at the late self-attention site (Section 5.3). Furthermore, when compared to fine-tuning (Zhu et al., 2020) and meta-learning (Mitchell et al., 2021; De Cao et al., 2021), our benchmarks find that the explicitly localized ROME method avoids both generalization and specificity failures seen in other knowledge editing approaches, outperforming state-of-the-art opaque methods even at billion-parameter scale (Section 5.4).…”
Section: Introduction
confidence: 74%
“…Despite increasing adoption of this architecture, their knowledge representation remains under-explored. Research has been done for masked models (Petroni et al., 2019; Jiang et al., 2020; Elazar et al., 2021a; Geva et al., 2021; Dai et al., 2021; De Cao et al., 2021), but GPT's architectural differences (e.g., unidirectional attention, generation capabilities) provide an opportunity for new insights.…”
Section: Introduction
confidence: 99%
“…A related line of work has explored editing neural predictions after training given a dataset of revised input and output pairs (Sinitsin et al., 2020; Zhu et al., 2020; Cao et al., 2021). Here we introduce a different setting where we have access to new unlabeled text after model training, which must be used implicitly to update the factual predictions of the model.…”
Section: Related Work
confidence: 99%