2022
DOI: 10.48550/arxiv.2212.10560
Preprint

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Cited by 68 publications (104 citation statements)
References 0 publications

“…Finally, we compare our random sampling method for our best setting to our diversity-promoting sampling method described in Section 3.3 [46]. The bottom row of Table 6 shows that we get a slight improvement in SACC to 78.46% by sampling more diverse prompts, as well as an improvement from 97.55% to 100% for personality accuracy after ranking (PAC AR).…”
Section: Results
confidence: 99%
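The quoted passage credits the gain to sampling more diverse prompts, but the method itself lives in Section 3.3 of [46] and is not reproduced here. Below is a minimal sketch of one common way to promote diversity, greedy farthest-point selection over prompt embeddings; the `embed` function and the max-min criterion are assumptions for illustration, not the cited method.

```python
# A sketch of diversity-promoting prompt sampling via greedy
# farthest-point selection. embed() is a hypothetical stand-in
# for a real sentence embedder.
import numpy as np

def embed(prompt: str) -> np.ndarray:
    """Hypothetical prompt encoder; swap in a real sentence embedder."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.normal(size=64)

def diverse_sample(prompts: list[str], k: int) -> list[str]:
    """Greedily pick k prompts, each maximizing its minimum distance
    to the prompts already selected (farthest-point sampling)."""
    vecs = np.stack([embed(p) for p in prompts])
    chosen = [0]  # start from an arbitrary prompt
    while len(chosen) < k:
        dists = np.linalg.norm(vecs[:, None] - vecs[chosen][None], axis=-1)
        min_dist = dists.min(axis=1)   # distance to nearest chosen prompt
        min_dist[chosen] = -np.inf     # never re-pick a chosen prompt
        chosen.append(int(min_dist.argmax()))
    return [prompts[i] for i in chosen]
```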
“…These are lower than fine-tuned models, and thus, in a real setting, fine-tuned models would still need to be used; moreover, the Jurassic model cannot be run in real time. Both of these limitations might be addressable by instruction tuning a smaller model for data-to-text and stylistic control tasks such as those we report here [31,46]. Another limitation is that we only tested our approach on two domains, and only on five personality styles.…”
Section: Discussion
confidence: 99%
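The discussion proposes instruction tuning a smaller model for data-to-text and stylistic control. The sketch below shows the standard causal-LM recipe that implies: format (instruction, input, output) triples as text and fine-tune on them. The model name, the toy example, and the prompt template are assumptions; a real run needs a full dataset, batching, and many more steps.

```python
# A minimal sketch of instruction tuning a smaller model on a
# data-to-text task with stylistic control. The example record is
# illustrative, not taken from [31] or [46].
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in "smaller model"
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

example = {
    "instruction": "Describe the restaurant in an extraverted style.",
    "input": "name: Aromi | food: Italian | area: riverside",
    "output": "You'll absolutely love Aromi, a lively Italian spot by the river!",
}
text = f"{example['instruction']}\n{example['input']}\n{example['output']}"

batch = tok(text, return_tensors="pt")
# Standard causal-LM instruction tuning: labels are the tokens
# themselves, so the model learns to produce the output given the prompt.
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
opt.step()
```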
“…", we want the LM to generate the target "El nuevo edificio de oficinas se construyó en tres meses.". The template is commonly humanmade including unnatural instructions [112] and natural instructions [113,114], or bootstrap based on a seed corpus [115]. Ethical and social risks of harm from LMs are significant concerns in SFT [116].…”
Section: Instruction-aligning Methodsmentioning
confidence: 99%
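The "bootstrapped from a seed corpus" option [115] is the Self-Instruct recipe of this paper: prompt the LM with a few seed instructions, ask for a new one, and keep it only if it is not too similar to anything already in the pool. Here is a minimal sketch under those assumptions; `lm_generate` is a hypothetical stand-in for any text-generation call, and the novelty filter uses a plain LCS-based ROUGE-L, with the demo count and 0.7 similarity threshold following the paper.

```python
# A sketch of seed-corpus bootstrapping in the spirit of Self-Instruct.
import random

def lcs_len(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(a: str, b: str) -> float:
    """ROUGE-L F1 between two instructions (whitespace tokenization)."""
    ta, tb = a.split(), b.split()
    lcs = lcs_len(ta, tb)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(tb), lcs / len(ta)
    return 2 * p * r / (p + r)

def bootstrap(seed_pool: list[str], lm_generate, rounds: int = 100,
              k_demos: int = 8, max_sim: float = 0.7) -> list[str]:
    """Grow an instruction pool by prompting the LM with seed demos."""
    pool = list(seed_pool)
    for _ in range(rounds):
        demos = random.sample(pool, min(k_demos, len(pool)))
        prompt = ("Come up with a new task instruction.\n"
                  + "".join(f"Task: {d}\n" for d in demos) + "Task:")
        candidate = lm_generate(prompt).strip()
        # Keep the candidate only if it is sufficiently novel.
        if candidate and max(rouge_l(candidate, p) for p in pool) < max_sim:
            pool.append(candidate)
    return pool
```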
“…However, whether the instruction-following ability of LLMs is newly obtained through instruction tuning or is already acquired during pretraining is under-explored. Wang et al (2022b); Honovich et al (2022) show that downstream tasks generated by LLMs themselves, which contain noisy instances, can actually be good training instances for instruction tuning, implying that LLMs are already somewhat aware of instructions. We extend this hypothesis that LLMs already have the capability to follow instructions by applying in-context learning to instruction learning, which does not require any backpropagation, using the pretrained model checkpoint without any gradient update.…”
Section: Related Work
confidence: 99%
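The hypothesis in this passage, that instruction following can be elicited with in-context learning alone, amounts to building a prompt from (instruction, input, output) demonstrations and letting a frozen model complete the answer, with no gradient updates. Below is a minimal sketch under that assumption; `lm_generate` is again a hypothetical text-generation call and the prompt layout is illustrative.

```python
# A sketch of in-context instruction learning with a frozen LM:
# no fine-tuning, no backpropagation, only prompt construction.

def in_context_instruct(lm_generate, instruction: str, x: str,
                        demos: list[tuple[str, str, str]]) -> str:
    """Build a prompt from (instruction, input, output) demonstrations
    plus the query, then let the frozen model complete the output."""
    parts = [f"Instruction: {d_instr}\nInput: {d_in}\nOutput: {d_out}"
             for d_instr, d_in, d_out in demos]
    parts.append(f"Instruction: {instruction}\nInput: {x}\nOutput:")
    return lm_generate("\n\n".join(parts))

# Example usage (hypothetical demonstrations):
# in_context_instruct(lm_generate,
#                     "Translate to Spanish.",
#                     "Good morning, everyone.",
#                     demos=[("Translate to Spanish.", "Thank you.", "Gracias.")])
```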