Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.243

The Power of Scale for Parameter-Efficient Prompt Tuning

Abstract: In this work, we explore "prompt tuning," a simple yet effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks. Unlike the discrete text prompts used by GPT-3, soft prompts are learned through backpropagation and can be tuned to incorporate signals from any number of labeled examples. Our end-to-end learned approach outperforms GPT-3's few-shot learning by a large margin. More remarkably, through ablations on model size using T5, we show that pro…
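The mechanism described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' released code: it assumes a generic transformer backbone that accepts input embeddings, freezes all of its pretrained weights, and prepends a small matrix of trainable soft-prompt embeddings that is learned by ordinary backpropagation.

```python
import torch
import torch.nn as nn

class PromptTunedModel(nn.Module):
    """Minimal sketch of prompt tuning: a frozen backbone conditioned on
    learned soft-prompt embeddings prepended to the input embeddings."""

    def __init__(self, backbone: nn.Module, num_prompt_tokens: int = 20, embed_dim: int = 768):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():        # freeze every pretrained weight
            p.requires_grad = False
        # The only trainable parameters: one vector per soft-prompt token.
        self.soft_prompt = nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) embeddings of the task input
        batch_size = input_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)
        # Prepend the soft prompt and run the frozen backbone on the result.
        return self.backbone(torch.cat([prompt, input_embeds], dim=1))
```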

Cited by 950 publications (783 citation statements)
References 24 publications
“…However, the selection of discrete prompts is still an independent process and cannot be optimized together with the downstream tasks in an end-to-end manner. To solve this issue, (Lester et al., 2021; Li and Liang, 2021) propose to use soft prompts, which are sets of trainable vectors, in the frozen pretrained language models. Unlike the hard prompts, these vectors do not correspond to any real words.…”
Section: Related Work (mentioning)
Confidence: 99%
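The hard/soft distinction drawn in this statement can be made concrete. The sketch below is illustrative only (the embedding table and token IDs are placeholders): a hard prompt is a sequence of vocabulary IDs whose embeddings are rows of the pretrained embedding table, while a soft prompt is a free continuous matrix that need not coincide with any vocabulary row.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 32_000, 768
embedding_table = nn.Embedding(vocab_size, embed_dim)   # stand-in for pretrained token embeddings

# Hard (discrete) prompt: token IDs of a textual prompt; its embeddings are
# looked up from the vocabulary table, so each vector corresponds to a real word.
hard_prompt_ids = torch.tensor([[87, 1542, 29, 6013]])  # placeholder IDs
hard_prompt_embeds = embedding_table(hard_prompt_ids)

# Soft (continuous) prompt: freely trainable vectors optimized end-to-end with
# the downstream loss; they need not match any row of the embedding table.
soft_prompt = nn.Parameter(torch.randn(1, 4, embed_dim) * 0.02)
```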
“…However, finding suitable discrete task introductions cannot be easily optimized in an end-to-end fashion and requires extra human effort. In this paper, inspired by the recent work (Lester et al., 2021; Li and Liang, 2021), we replace the task introductions with Soft Prompt (i.e., a sequence of continuous and trainable vectors). During training, we only update the parameters of this Soft Prompt and fix all PLM parameters.…”
Section: Prompt-based Learning (mentioning)
Confidence: 99%
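The recipe in this statement (update only the soft prompt, keep every pretrained LM parameter fixed) amounts to handing the optimizer just the prompt tensor. A hedged sketch follows, reusing the hypothetical PromptTunedModel from the earlier snippet together with a hypothetical task dataloader.

```python
import torch
import torch.nn.functional as F

model = PromptTunedModel(backbone, num_prompt_tokens=20, embed_dim=768)

# The optimizer only ever sees the soft-prompt parameters; the frozen backbone
# receives no updates.
optimizer = torch.optim.AdamW([model.soft_prompt], lr=0.3)

for input_embeds, labels in dataloader:          # hypothetical dataloader of embedded inputs
    logits = model(input_embeds)
    # (alignment of labels with the prepended prompt positions is glossed over here)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    loss.backward()                              # gradients flow only into the soft prompt
    optimizer.step()
    optimizer.zero_grad()
```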
“…Specifically, we follow the definition of toxicity from Perspective API as well as the inspiration by the recent work from Prompt Engineering (Liu et al., 2021b; Shin et al., 2020; Li & Liang, 2021; Lester et al., 2021; Zhao et al., 2021; Schick & Schütze, 2020a) that repeating the prompts and prompting LMs in the format of Question Answering, and design the prompts below to study the generation and understanding power of the LMs, 1. Negative Prompt (for once).…”
Section: A41 Unconditional Generation (mentioning)
Confidence: 99%