2021 · Preprint
DOI: 10.48550/arxiv.2110.08207

Multitask Prompted Training Enables Zero-Shot Task Generalization

Abstract: Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervis…
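The abstract describes a system for mapping natural language tasks into a human-readable prompted form. As a minimal sketch only, with a hypothetical template and helper function that are not the paper's actual prompt collection, "applying a prompt" to one example could look like this:

```python
# Hypothetical template and helper; not the paper's actual prompt collection.

def apply_prompt(template: str, example: dict) -> str:
    """Fill a natural-language prompt template with one example's fields."""
    return template.format(**example)

nli_template = 'Suppose "{premise}" Can we infer that "{hypothesis}"? Yes, no, or maybe?'

example = {
    "premise": "A man is playing a guitar on stage.",
    "hypothesis": "Someone is performing music.",
}

print(apply_prompt(nli_template, example))
# Suppose "A man is playing a guitar on stage." Can we infer that
# "Someone is performing music."? Yes, no, or maybe?
```

The prompted input and its natural-language answer then serve directly as a text-to-text training pair.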

Cited by 86 publications (149 citation statements) · References 22 publications

“…Large language models can generalize to these unseen instructions, obtaining reasonable performance in a wide variety of tasks. Moreover, recent works (Sanh et al., 2021) have shown that we can improve the performance of this instruction-following behavior by fine-tuning on a multi-task mixture using natural language descriptions of the tasks, which mirrors closely the results from the multilingual MT literature.…”
Section: Introduction (supporting)
confidence: 78%
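The statement above describes fine-tuning on a multi-task mixture where each task is expressed through a natural-language description. A rough sketch of assembling such a mixture is given below; the task names, templates, and toy records are invented for illustration and are not the actual training mixture used in any of the cited works.

```python
import random

# Hedged sketch: build a multi-task training mixture in which every task is
# expressed through a natural-language prompt. Task names, templates, and
# records below are invented for illustration.

TEMPLATES = {
    "sentiment": "Review: {text}\nIs this review positive or negative?",
    "summarization": "Summarize the following article:\n{text}",
    "nli": '{premise}\nQuestion: does this imply "{hypothesis}"? yes or no?',
}

DATASETS = {
    "sentiment": [{"text": "Great film!", "label": "positive"}],
    "summarization": [{"text": "A long article ...", "label": "A short summary."}],
    "nli": [{"premise": "A dog runs.", "hypothesis": "An animal moves.",
             "label": "yes"}],
}

def build_mixture():
    """Flatten all tasks into (input, target) text pairs and interleave them."""
    mixture = []
    for task, examples in DATASETS.items():
        for ex in examples:
            prompt = TEMPLATES[task].format(**ex)   # fill the prompt template
            mixture.append({"input": prompt, "target": ex["label"]})
    random.shuffle(mixture)                          # mix tasks together
    return mixture

for pair in build_mixture():
    print(pair["input"], "->", pair["target"])
```

Interleaving the flattened examples keeps each training batch task-diverse, matching the multi-task mixture described in the quote.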
“…In this formulation, we could formulate our prompts as Translate to {language_name}: {input_slot}. Sanh et al. (2021) have shown this ability to follow natural language instructions can be improved by finetuning the model on a diverse mixture of tasks. All of these works have typically focused on large, English-centric language models.…”
Section: Related Work (mentioning)
confidence: 99%
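The quoted work writes its machine translation prompts as Translate to {language_name}: {input_slot}. Rendering that pattern is plain string templating; the snippet below is a hypothetical illustration with made-up inputs.

```python
# The quoted prompt pattern, rendered with hypothetical inputs. The language
# and sentence below are illustrative only.

MT_TEMPLATE = "Translate to {language_name}: {input_slot}"

def make_mt_prompt(language_name: str, source_sentence: str) -> str:
    """Fill the translation prompt with a target language and a source sentence."""
    return MT_TEMPLATE.format(language_name=language_name, input_slot=source_sentence)

print(make_mt_prompt("German", "The weather is nice today."))
# -> Translate to German: The weather is nice today.
```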
“…Motivated by this idea, we combine approaches from SOLOIST, MUPPET, and T0 for PrefineDST in an attempt to train a robust DST model through prefinetuning (Peng et al., 2020a; Aghajanyan et al., 2021; Sanh et al., 2021). We choose prefinetuning tasks based on their intuitive potential for improving on qualities measured by CheckDST and uniformly format these non-target datasets as text-to-text generation tasks with the help of instruction prompts.…”
Section: PrefineDST (mentioning)
confidence: 99%
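The quote describes uniformly formatting heterogeneous non-target datasets as text-to-text generation tasks with instruction prompts. The sketch below illustrates that idea under assumptions: the instruction strings, task names, and example record are hypothetical rather than the actual PrefineDST data.

```python
# Hedged sketch: serialize heterogeneous examples into a single text-to-text
# format with an instruction prefix. Instruction strings, task names, and the
# example record are hypothetical.

INSTRUCTIONS = {
    "intent": "Identify the intent of the user's utterance.",
    "qa": "Answer the question using the passage.",
}

def to_text_to_text(task: str, fields: dict, target: str) -> dict:
    """Return one (source, target) text pair with the task's instruction prepended."""
    body = " ".join(f"{key}: {value}" for key, value in fields.items())
    return {"source": f"{INSTRUCTIONS[task]} {body}", "target": target}

print(to_text_to_text("intent",
                      {"utterance": "Book me a table for two at 7pm."},
                      "restaurant_reservation"))
# {'source': "Identify the intent of the user's utterance. utterance: Book me
#  a table for two at 7pm.", 'target': 'restaurant_reservation'}
```

Because every task shares the same (source text, target text) interface, no task-specific output layers are needed during prefinetuning.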
“…The most similar work to PrefineDST is MUPPET, a BART model prefinetuned on more than 50 heterogeneous tasks via additional layers that accommodate different task structures (Aghajanyan et al., 2021). We adapt the multitasking approach of Sanh et al. (2021) to remove the additional layers used by MUPPET.…”
Section: Related Work (mentioning)
confidence: 99%