Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.53

Meta-learning via Language Model In-context Tuning

Abstract: The goal of meta-learning is to learn to adapt to a new task with only a few labeled examples. Inspired by the recent progress in large language models, we propose in-context tuning (ICT), which recasts task adaptation and prediction as a simple sequence prediction problem: to form the input sequence, we concatenate the task instruction, labeled in-context examples, and the target input to predict; to meta-train the model to learn from in-context examples, we fine-tune a pre-trained language model (LM) to predict…
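The input format the abstract describes can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function name, the `->` separator, and the toy sentiment task are all assumptions; the paper's actual templates may differ.

```python
# Sketch of the in-context tuning (ICT) input sequence: task instruction,
# then labeled in-context examples, then the target input whose label the
# fine-tuned LM is trained to predict. Names and format are illustrative.

def build_ict_sequence(instruction, support_examples, target_input):
    """Concatenate instruction, (input, label) exemplars, and the target
    input into one prefix; the LM predicts the label that follows it."""
    parts = [instruction]
    for x, y in support_examples:
        parts.append(f"{x} -> {y}")
    parts.append(f"{target_input} ->")
    return " ".join(parts)

seq = build_ict_sequence(
    instruction="Classify the sentiment as positive or negative.",
    support_examples=[("great movie", "positive"), ("dull plot", "negative")],
    target_input="loved it",
)
print(seq)
```

During meta-training, sequences like this are sampled across many tasks and the LM's loss is computed on the label tokens after the final `->`, so the model learns to read the exemplars rather than memorize any single task.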

Cited by 16 publications (10 citation statements); references 40 publications (46 reference statements).
“…However, using humans to create open-domain instruction datasets like OpenAI did will encounter the following challenges. The whole annotating process is extremely expensive and time-consuming [18][19][20][21]. On the other hand, the difficulty level distribution of human-created instructions is skewed towards being easy or moderate, with fewer difficult ones (according to the difficulty statistics of ShareGPT [22] from Figure 7a).…”
Section: Instruction Learning (mentioning; confidence: 99%)
“…ProtEx is also related to in-context tuning methods for few-shot tasks [43, 13], where pretrained language models are meta-trained to make predictions given an input and task-relevant exemplars. These works show strong performance on unseen tasks, enabled by the LM’s ability to make predictions from an input and a few in-context exemplars.…”
Section: Background and Related Work (mentioning; confidence: 99%)
“…Additionally, researchers should evaluate the integrity of the labeling process (i.e., the process that determines which items belong to which labels (e.g., Mirończuk & Protasiewicz, 2018). If mislabeled, scale items are likely to hurt model performance (e.g., Chen et al, 2022; Phang et al, 2019; Saarikoski et al, 2015; Schick & Schütze, 2021). In the same vein, researchers may be motivated to include items that are indirectly related to the dimension labels of interest to obtain a larger number of items for training (e.g., collecting popular scales used in clinical psychology and labeling them as “neuroticism” items or collecting “extraversion” items from leadership scales).…”
Section: Demonstration: Training Transformers To Classify Personality... (mentioning; confidence: 99%)