2021
DOI: 10.1162/tacl_a_00410

Measuring and Improving Consistency in Pretrained Language Models

Abstract: Consistency of a model—that is, the invariance of its behavior under meaning-preserving alternations in its input—is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel🤘, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel🤘, we show that the consistency of all PLMs we…
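
The measurement behind ParaRel🤘 is easy to sketch: pose the same fact through several meaning-preserving paraphrases of a cloze pattern and check whether the model's top prediction stays the same. The snippet below is a minimal illustration of that idea using a Hugging Face fill-mask pipeline; the two paraphrases, the subject, and the model choice are illustrative assumptions, not items taken from ParaRel or the authors' evaluation code.

```python
# Minimal sketch of consistency-under-paraphrase probing.
# NOTE: illustrative only; the paraphrases, subject, and model are
# assumptions, not taken from ParaRel or the paper's evaluation code.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-cased")

# Two meaning-preserving paraphrases of the "capital-of" relation,
# instantiated with the same subject (France).
paraphrases = [
    "The capital of France is [MASK].",
    "France's capital city is [MASK].",
]

# Take the top-ranked filler for each paraphrase.
predictions = [unmasker(p)[0]["token_str"] for p in paraphrases]
print(predictions)

# A consistent model predicts the same object for every paraphrase
# of the same underlying fact, whether or not that object is correct.
print("consistent:", len(set(predictions)) == 1)
```

Roughly speaking, the paper aggregates this kind of pairwise agreement over all paraphrase pairs and subject instantiations of each relation; note that consistency is measured as invariance of the prediction, independently of its factual correctness.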

Cited by 95 publications (113 citation statements)
References 55 publications

“…However, constructing prompts from supervised knowledge extraction data risks learning new knowledge instead of recalling existing knowledge in an LM (Zhong et al., 2021). More recently, Elazar et al. (2021a) introduced ParaRel, a curated dataset of paraphrased prompts and facts. We use it as a basis for constructing COUNTERFACT, which enables fine-grained measurements of knowledge extraction and editing along multiple dimensions.…”
Section: Extracting Knowledge From LMs (mentioning, confidence: 99%)
“…Despite increasing adoption of this architecture, its knowledge representation remains under-explored. Research has been done for masked models (Petroni et al., 2019; Jiang et al., 2020; Elazar et al., 2021a; Geva et al., 2021; Dai et al., 2021; De Cao et al., 2021), but GPT's architectural differences (e.g., unidirectional attention, generation capabilities) provide an opportunity for new insights.…”
Section: Introduction (mentioning, confidence: 99%)
“…As large-scale language models are gradually evolving towards more abstract inference, it is crucial to study and understand the underlying semantics encoded in their representations to identify biases and inconsistencies within the models (Elazar et al., 2021a), improve transparency (Thayaparan et al., 2020), and further investigate their generalisation and reasoning capabilities (Hu et al., 2020).…”
Section: Introduction (mentioning, confidence: 99%)
“…But beyond the intuition that patterns serve as some sort of task instruction (Schick and Schütze, 2021a), little is known about the reasons for their success. Recent findings that (i) PLMs can fail to follow even simple instructions (Efrat and Levy, 2020), that (ii) PLMs can behave drastically differently with paraphrases of the same pattern (Elazar et al., 2021), and that (iii) performance increases if we train a second model to rewrite an input pattern with the goal of making it more comprehensible for a target PLM (Haviv et al., 2021), strongly suggest that patterns do not make sense to PLMs in the same way as they do to humans.…”
Section: Introduction (mentioning, confidence: 99%)