Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.398

Factual Probing Is [MASK]: Learning vs. Learning to Recall

Abstract: Petroni et al. (2019) demonstrated that it is possible to retrieve world facts from a pretrained language model by expressing them as cloze-style prompts and interpret the model's prediction accuracy as a lower bound on the amount of factual information it encodes. Subsequent work has attempted to tighten the estimate by searching for better prompts, using a disjoint set of facts as training data. In this work, we make two complementary contributions to better understand these factual probing techniques. First…
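For concreteness, here is a minimal sketch of the cloze-style probing setup the abstract describes, using the Hugging Face fill-mask pipeline. The model (bert-base-cased) and the example fact are illustrative assumptions, not the paper's exact configuration:

```python
# Cloze-style factual probe in the spirit of Petroni et al. (2019): the fact
# (Dante, born-in, Florence) is posed as a [MASK] query and the model's top
# prediction is checked against the gold object.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")  # assumed model choice
for pred in fill("Dante was born in [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
# If "Florence" ranks first, the fact counts as recalled; accuracy over many
# such queries lower-bounds the factual knowledge the pretrained LM encodes.
```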

Cited by 152 publications (113 citation statements)
References 22 publications (28 reference statements)

Citation statements:
“…The answer is simple. Typical language modeling corpora like Wikipedia are known to contain KB-like assertions about the world (Da and Kasai, 2019). LMs trained on enough such data can be expected to acquire some KB-like knowledge, even without targeted entity- or relation-level supervision.…”

[Figure residue from the citing paper: a taxonomy of knowledge-extraction methods — cloze prompting via prompt handcrafting (Petroni et al., 2019), automatic prompt engineering (Jiang et al., 2020b; Shin et al., 2020; Zhong et al., 2021; Qin and Eisner, 2021), adversarial prompt modification (Poerner et al., 2020), varying base prompts (Elazar et al., 2021; Heinzerling and Inui, 2021; Jiang et al., 2020a), and symbolic rule-based prompting (Talmor et al., 2020a); statement scores (§3.2) via single-LM scoring (Tamborrino et al., 2020) and dual-LM scoring (Davison et al., 2019; Shwartz et al., 2020).]

Section: Word-level Supervision
confidence: 99%
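The "statement scores" entry in the taxonomy above can be made concrete with a pseudo-log-likelihood sketch: mask each token of a candidate statement in turn and sum the masked LM's log-probabilities, so more plausible statements score higher. This is an illustrative single-LM variant in the spirit of Tamborrino et al. (2020); the model and example sentences are assumptions:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-cased")  # assumed model
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-cased").eval()

@torch.no_grad()
def statement_score(text: str) -> float:
    """Pseudo-log-likelihood: sum of log P(token | rest) under the masked LM."""
    ids = tok(text, return_tensors="pt").input_ids
    total = 0.0
    for i in range(1, ids.size(1) - 1):       # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[0, i] = tok.mask_token_id      # hide one token at a time
        logits = mlm(masked).logits[0, i]
        total += torch.log_softmax(logits, -1)[ids[0, i]].item()
    return total                              # higher = more plausible

print(statement_score("Dante was born in Florence."))  # expected: higher
print(statement_score("Dante was born in Beijing."))   # expected: lower
```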
“…Automatic prompt engineering is a promising alternative to prompt handcrafting for knowledge extraction from LMs (Liu et al., 2021a), as prompts engineered using discrete (Jiang et al., 2020b; Shin et al., 2020; Haviv et al., 2021) and continuous (Zhong et al., 2021; Qin and Eisner, 2021; Liu et al., 2021b) optimization have improved LMs' lower-bound performance on LAMA's underlying queries. Note, however, that optimized prompts are not always grammatical or intelligible (Shin et al., 2020).…”

Section: Cloze Prompting
confidence: 99%
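The continuous-optimization line of work this quote mentions, to which the cited paper's OptiPrompt belongs, replaces the hand-written relation template with trainable vectors and optimizes them by gradient descent while the LM stays frozen. A minimal sketch, assuming bert-base-cased, a 5-vector prompt, and a toy (subject, object) training pair; none of these are the authors' exact settings:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-cased"                 # assumed model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()
for p in model.parameters():
    p.requires_grad_(False)                    # freeze the LM entirely

embed = model.get_input_embeddings()
n_prompt = 5                                   # assumed prompt length
soft_prompt = torch.nn.Parameter(torch.randn(n_prompt, embed.embedding_dim) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def loss_for_fact(subject: str, gold_object: str) -> torch.Tensor:
    """Cross-entropy of the gold object at [MASK], with the relation template
    expressed purely by trainable vectors: [CLS] subject <p1..pk> [MASK] [SEP]."""
    def special(tid: int) -> torch.Tensor:
        return embed(torch.tensor([[tid]]))
    subj = tokenizer(subject, add_special_tokens=False, return_tensors="pt").input_ids
    inputs_embeds = torch.cat(
        [special(tokenizer.cls_token_id), embed(subj), soft_prompt.unsqueeze(0),
         special(tokenizer.mask_token_id), special(tokenizer.sep_token_id)], dim=1)
    logits = model(inputs_embeds=inputs_embeds).logits
    mask_pos = 1 + subj.size(1) + n_prompt     # index of the [MASK] slot
    gold = tokenizer.convert_tokens_to_ids(gold_object)
    return torch.nn.functional.cross_entropy(logits[:, mask_pos], torch.tensor([gold]))

# One illustrative gradient step on a toy training fact:
optimizer.zero_grad()
loss_for_fact("Dante", "Florence").backward()
optimizer.step()
```

As the quote notes, the learned vectors need not correspond to grammatical or even intelligible text; only their effect on the [MASK] prediction is optimized.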
“…Recently, many prompt-based works have emerged, i.e., manually-designed (Schick and Schütze, 2021a,b; Mishra et al., 2021) or automatically-searched (Jiang et al., 2020; Shin et al., 2020; Gao et al., 2021) hard prompts, which are discrete tokens but not necessarily human-readable. Furthermore, soft prompts (Li and Liang, 2021; Hambardzumyan et al., 2021; Zhong et al., 2021; Liu et al., 2021) have been introduced: tunable embeddings rather than vocabulary tokens, which can be trained directly with task-specific supervision. Subsequent work demonstrates that this prompt tuning (PT) method can match the performance of full-parameter fine-tuning when the PLM is extremely large.…”

Section: Introduction
confidence: 99%
“…We harness the knowledge present in large-scale pre-trained language models (Davison et al., 2019; Zhou et al., 2020; Petroni et al., 2019; Zhong et al., 2021; Shin et al., 2020) to detect a rich set of biases. Our method prompts the LM with a textual post and labeled exemplars selected using a novel technique, along with instructions to detect bias in this post.…”

Section: Introduction
confidence: 99%
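A sketch of the instruction-plus-exemplars prompt format this quote describes; the instruction wording, labels, and exemplars are invented for illustration, and the cited paper's exemplar-selection technique itself is not reproduced here:

```python
def build_bias_prompt(post: str, exemplars: list[tuple[str, str]]) -> str:
    """Assemble an instruction, labeled exemplars, and the query post into a
    single prompt for a text-completion LM."""
    parts = ["Decide whether each post contains social bias. Answer Yes or No."]
    for text, label in exemplars:
        parts.append(f"Post: {text}\nBiased: {label}")
    parts.append(f"Post: {post}\nBiased:")    # the LM completes the label
    return "\n\n".join(parts)

prompt = build_bias_prompt(
    "People from that city are all lazy.",
    [("Everyone deserves equal respect.", "No"),
     ("Women are bad at math.", "Yes")],
)
print(prompt)  # feed this string to any autoregressive LM
```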