Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.556
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity

Abstract: When primed with only a handful of training samples, very large, pretrained language models such as GPT-3 have shown competitive results when compared to fully-supervised, finetuned, large, pretrained language models. We demonstrate that the order in which the samples are provided can make the difference between near state-of-the-art and random guess performance: essentially some permutations are "fantastic" and some not. We analyse this phenomenon in detail, establishing that: it is present across model sizes…
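The abstract's central claim is that the same few-shot examples, merely reordered, yield different prompts and therefore different accuracy. A minimal sketch of what "permuting the prompt" means in practice (the task, example texts, and template below are illustrative assumptions, not taken from the paper):

```python
from itertools import permutations

# Hypothetical few-shot demonstrations for sentiment classification
# (illustrative data, not drawn from the paper's benchmarks).
examples = [
    ("The film was a delight.", "positive"),
    ("A tedious, joyless slog.", "negative"),
    ("Surprisingly moving.", "positive"),
]

def build_prompt(ordered_examples, test_input):
    """Concatenate demonstrations in the given order, then append the test input."""
    demos = "\n".join(
        f"Review: {text}\nSentiment: {label}" for text, label in ordered_examples
    )
    return f"{demos}\nReview: {test_input}\nSentiment:"

# Every ordering of the same 3 examples yields a distinct prompt string;
# the paper's finding is that these can differ wildly in downstream accuracy.
prompts = [build_prompt(p, "I loved it.") for p in permutations(examples)]
print(len(prompts))  # 3! = 6 distinct prompts from identical examples
```

With k demonstrations there are k! orderings, which is why the paper treats ordering as a search problem rather than an afterthought.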

Cited by 192 publications
(175 citation statements)
References 21 publications
“…Prompt construction requires a non-trivial combinatorial search over the prompt's wording, whether to include training examples, and how to convert LM probabilities to class predictions. As a consequence, prompts are either designed using human intuition that is hard to replicate and apply in a principled manner (Perez et al., 2021), or using automated methods (Shin et al., 2020; Gao et al., 2021; Lu et al., 2021). These methods search for elements such as: (1) the text of the pattern, (2) the tokens in the verbalizers, and (3) whether and how training examples are prepended before the test input.…”
Section: Constructing the Prompt
confidence: 99%
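The statement above describes prompt design as a combinatorial search over three axes: pattern text, verbalizer tokens, and the ordering of prepended demonstrations. A rough sketch of how those axes multiply (the candidate patterns and verbalizers here are invented for illustration; this is not any paper's actual search procedure):

```python
from itertools import permutations, product

# Illustrative search space for a binary sentiment task
# (all candidate values are assumptions, not from the cited work).
patterns = ["Review: {x} Sentiment:", "{x} All in all, it was"]
verbalizers = [("positive", "negative"), ("great", "terrible")]
demo_orders = list(permutations(range(3)))  # all orderings of 3 training examples

# The full space of prompt configurations is the cross product of the axes.
search_space = list(product(patterns, verbalizers, demo_orders))
print(len(search_space))  # 2 patterns * 2 verbalizers * 6 orderings = 24
```

Even this toy setup yields 24 configurations from tiny candidate sets, which is why the quoted passage calls the search "non-trivial": real spaces grow multiplicatively with each axis.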
“…Thus far, we have shown that prompt-based finetuning can simplify prompt engineering at the cost of memory inefficiency: a new set of parameters must be learned for each task. This is in contrast to in-context learning, which holds all model weights fixed but is heavily influenced by small prompt modifications (Zhao et al., 2021; Lu et al., 2021). In this section, we investigate how to achieve both memory efficiency and simple prompts.…”
Section: Achieving Simplicity and Efficiency
confidence: 99%
“…Our method provides a way to collect large amounts of training data using a small set of labeled seed examples, and allows for more direct control over what the model learns compared to relatively brittle prompts (Lu et al., 2021).…”
Section: Introduction
confidence: 99%
“…Notably, pre-trained language models (PLMs) have learned a substantial amount of in-depth knowledge from data, and have achieved tremendous promise in few-shot/zero-shot learning with natural language prompts [12,48,54]. However, recent studies [35,37,56] observe that prompt learning with PLMs usually generalizes unstably in extremely low-resource settings or emerging domains. One potential reason is that it is non-trivial for parametric models to learn rare or hard patterns well through rote memorization, resulting in weak generalization performance.…”
Section: Introduction
confidence: 99%