2022
DOI: 10.48550/arxiv.2212.10560
Preprint

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Cited by 68 publications (104 citation statements)
References 0 publications

“…Finally, we compare our random sampling method for our best setting to our diversity-promoting sampling method described in Section 3.3 [46]. The bottom row of Table 6 shows that we get a slight improvement in SACC to 78.46% by sampling more diverse prompts, as well as an improvement from 97.55% to 100% for personality accuracy after ranking (PAC AR).…”
Section: Results
confidence: 99%
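The quoted passage credits the gain to sampling more diverse prompts, but the method itself lives in Section 3.3 of [46] and is not reproduced here. Below is a minimal sketch of one common way to promote diversity, greedy farthest-point selection over prompt embeddings; the `embed` function and the max-min criterion are assumptions for illustration, not the cited method.

```python
# A sketch of diversity-promoting prompt sampling via greedy
# farthest-point selection. embed() is a hypothetical stand-in
# for a real sentence embedder.
import numpy as np

def embed(prompt: str) -> np.ndarray:
    """Hypothetical prompt encoder; swap in a real sentence embedder."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.normal(size=64)

def diverse_sample(prompts: list[str], k: int) -> list[str]:
    """Greedily pick k prompts, each maximizing its minimum distance
    to the prompts already selected (farthest-point sampling)."""
    vecs = np.stack([embed(p) for p in prompts])
    chosen = [0]  # start from an arbitrary prompt
    while len(chosen) < k:
        dists = np.linalg.norm(vecs[:, None] - vecs[chosen][None], axis=-1)
        min_dist = dists.min(axis=1)   # distance to nearest chosen prompt
        min_dist[chosen] = -np.inf     # never re-pick a chosen prompt
        chosen.append(int(min_dist.argmax()))
    return [prompts[i] for i in chosen]
```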
“…These are lower than fine-tuned models, and thus, in a real setting, fine-tuned models would still need to be used; moreover, the Jurassic model cannot be run in real time. Both of these limitations might be addressable by instruction tuning a smaller model for data-to-text and stylistic control tasks such as those we report here [31,46]. Another limitation is that we only tested our approach on two domains, and only on five personality styles.…”
Section: Discussion
confidence: 99%
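The discussion proposes instruction tuning a smaller model for data-to-text and stylistic control. The sketch below shows the standard causal-LM recipe that implies: format (instruction, input, output) triples as text and fine-tune on them. The model name, the toy example, and the prompt template are assumptions; a real run needs a full dataset, batching, and many more steps.

```python
# A minimal sketch of instruction tuning a smaller model on a
# data-to-text task with stylistic control. The example record is
# illustrative, not taken from [31] or [46].
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in "smaller model"
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

example = {
    "instruction": "Describe the restaurant in an extraverted style.",
    "input": "name: Aromi | food: Italian | area: riverside",
    "output": "You'll absolutely love Aromi, a lively Italian spot by the river!",
}
text = f"{example['instruction']}\n{example['input']}\n{example['output']}"

batch = tok(text, return_tensors="pt")
# Standard causal-LM instruction tuning: labels are the tokens
# themselves, so the model learns to produce the output given the prompt.
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
opt.step()
```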
“…", we want the LM to generate the target "El nuevo edificio de oficinas se construyó en tres meses.". The template is commonly humanmade including unnatural instructions [112] and natural instructions [113,114], or bootstrap based on a seed corpus [115]. Ethical and social risks of harm from LMs are significant concerns in SFT [116].…”
Section: Instruction-aligning Methodsmentioning
confidence: 99%
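The "bootstrapped from a seed corpus" option [115] is the Self-Instruct recipe of this paper: prompt the LM with a few seed instructions, ask for a new one, and keep it only if it is not too similar to anything already in the pool. Here is a minimal sketch under those assumptions; `lm_generate` is a hypothetical stand-in for any text-generation call, and the novelty filter uses a plain LCS-based ROUGE-L, with the demo count and 0.7 similarity threshold following the paper.

```python
# A sketch of seed-corpus bootstrapping in the spirit of Self-Instruct.
import random

def lcs_len(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(a: str, b: str) -> float:
    """ROUGE-L F1 between two instructions (whitespace tokenization)."""
    ta, tb = a.split(), b.split()
    lcs = lcs_len(ta, tb)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(tb), lcs / len(ta)
    return 2 * p * r / (p + r)

def bootstrap(seed_pool: list[str], lm_generate, rounds: int = 100,
              k_demos: int = 8, max_sim: float = 0.7) -> list[str]:
    """Grow an instruction pool by prompting the LM with seed demos."""
    pool = list(seed_pool)
    for _ in range(rounds):
        demos = random.sample(pool, min(k_demos, len(pool)))
        prompt = ("Come up with a new task instruction.\n"
                  + "".join(f"Task: {d}\n" for d in demos) + "Task:")
        candidate = lm_generate(prompt).strip()
        # Keep the candidate only if it is sufficiently novel.
        if candidate and max(rouge_l(candidate, p) for p in pool) < max_sim:
            pool.append(candidate)
    return pool
```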
“…However, whether the instruction-following ability of LLMs is newly obtained through instruction tuning or is already acquired during pretraining is under-explored. Wang et al (2022b); Honovich et al (2022) show that downstream tasks generated by LLMs themselves, which contain noisy instances, can actually be good training instances for instruction tuning, implying that LLMs are already somewhat aware of instructions. We extend this hypothesis that LLMs already have the capability to follow instructions by applying in-context learning to instruction learning, which does not require any backpropagation, using the pretrained model checkpoint without any gradient update.…”
Section: Related Work
confidence: 99%
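The hypothesis in this passage, that instruction following can be elicited with in-context learning alone, amounts to building a prompt from (instruction, input, output) demonstrations and letting a frozen model complete the answer, with no gradient updates. Below is a minimal sketch under that assumption; `lm_generate` is again a hypothetical text-generation call and the prompt layout is illustrative.

```python
# A sketch of in-context instruction learning with a frozen LM:
# no fine-tuning, no backpropagation, only prompt construction.

def in_context_instruct(lm_generate, instruction: str, x: str,
                        demos: list[tuple[str, str, str]]) -> str:
    """Build a prompt from (instruction, input, output) demonstrations
    plus the query, then let the frozen model complete the output."""
    parts = [f"Instruction: {d_instr}\nInput: {d_in}\nOutput: {d_out}"
             for d_instr, d_in, d_out in demos]
    parts.append(f"Instruction: {instruction}\nInput: {x}\nOutput:")
    return lm_generate("\n\n".join(parts))

# Example usage (hypothetical demonstrations):
# in_context_instruct(lm_generate,
#                     "Translate to Spanish.",
#                     "Good morning, everyone.",
#                     demos=[("Translate to Spanish.", "Thank you.", "Gracias.")])
```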