2021
DOI: 10.48550/arxiv.2104.08786
Preprint

Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity

Abstract: When primed with only a handful of training samples, very large pretrained language models such as GPT-3 have shown competitive results compared to fully-supervised, fine-tuned large pretrained language models. We demonstrate that the order in which the samples are provided can be the difference between near state-of-the-art and random-guess performance: essentially, some permutations are "fantastic" and some are not. We analyse this phenomenon in detail, establishing that: it is present across model sizes (ev…
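The order sensitivity described in the abstract comes from the fact that every permutation of the same few-shot demonstrations produces a different prompt string, and the model can behave very differently on each. A minimal sketch of how those permuted prompts are constructed — the demonstrations and the prompt template here are invented for illustration, not taken from the paper's datasets:

```python
from itertools import permutations

# Hypothetical few-shot demonstrations (text, label) for sentiment classification.
demos = [
    ("The movie was great.", "positive"),
    ("Terrible service.", "negative"),
    ("I loved the book.", "positive"),
]

def build_prompt(ordered_demos, query):
    """Concatenate demonstrations in the given order, then append the query."""
    parts = [f"Review: {text}\nSentiment: {label}" for text, label in ordered_demos]
    parts.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(parts)

# Each permutation of the same demonstrations yields a distinct prompt;
# the paper reports that downstream accuracy can vary drastically across them.
prompts = [build_prompt(p, "Not bad at all.") for p in permutations(demos)]
print(len(prompts))  # 3! = 6 distinct orderings
```

With k demonstrations there are k! orderings to consider, which is why the paper searches for "fantastic" permutations rather than relying on an arbitrary one.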

Cited by 51 publications (72 citation statements)
References 11 publications
“…In-context learning has been the focus of significant study since its introduction. Prior work proposes better ways of formulating the problem (Zhao et al, 2021; Holtzman et al, 2021; Min et al, 2021a), better ways of choosing labeled examples for the demonstrations (Liu et al, 2021; Lu et al, 2021; Rubin et al, 2021), meta-training with an explicit in-context learning objective (Min et al, 2021b), and learning to follow instructions as a variant of in-context learning (Mishra et al, 2021b; Efrat and Levy, 2020; Wei et al, 2022; Sanh et al, 2022). At the same time, some work reports brittleness and over-sensitivity for in-context learning (Lu et al, 2021; Zhao et al, 2021; Mishra et al, 2021a).…”
Section: Related Work
confidence: 99%
“…It is well known that prompt-based methods are sensitive to many aspects of prompts, including contexts (Jiang et al, 2020; Shin et al, 2020), orders (Lu et al, 2021) and length, and inappropriate prompts will cause bad performance.…”
Section: Effects Of Prompt Length
confidence: 99%
“…Zhao et al (2021) propose to remove the model bias before using GPT-3, which not only increases the accuracy but also reduces the variance. Lu et al (2021) work on how to order the few labeled data as input of GPT-3 by constructing an artificial development set. Concurrent with our work, Yoo et al (2021) consider distilling knowledge from GPT-3 with synthetic data.…”
Section: Related Work
confidence: 99%