Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
DOI: 10.18653/v1/2020.emnlp-main.38
Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Abstract: Self-supervised pre-training of transformer models has revolutionized NLP applications. Such pre-training with language modeling objectives provides a useful initial point for parameters that generalize well to new tasks with fine-tuning. However, fine-tuning is still data inefficient -when there are few labeled examples, accuracy can be low. Data efficiency can be improved by optimizing pre-training directly for future fine-tuning with few examples; this can be treated as a meta-learning problem. However, sta… Show more

Cited by 64 publications (84 citation statements)
References 28 publications
“…Informed output layer initialization in Proto(FO)MAML is therefore important for effective learning in such scenarios. A similar problem with FOMAML is also pointed out by Bansal et al. (2019), who design a differentiable parameter generator for the output layer.…”
Section: Discussion
confidence: 81%
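For context on what "informed output layer initialization" means here, below is a minimal sketch in the spirit of ProtoMAML: the softmax layer for a new task is initialized from class prototypes of the encoded support examples rather than randomly. It assumes a PyTorch setup; names such as `encoder`, `support_x`, and `support_y` are illustrative, not the cited papers' APIs.

```python
# Sketch: ProtoMAML-style informed initialization of the output layer.
# The linear softmax weights/biases are derived from class prototypes
# computed on the support set (all names are illustrative assumptions).
import torch


def proto_init_output_layer(encoder, support_x, support_y, num_classes):
    """Return (weight, bias) for a linear softmax layer initialized from
    class prototypes of the encoded support examples."""
    with torch.no_grad():
        feats = encoder(support_x)                      # [N, hidden]
    prototypes = torch.stack(
        [feats[support_y == c].mean(dim=0) for c in range(num_classes)]
    )                                                   # [C, hidden]
    # ProtoMAML-style mapping: logits_c = 2 * <proto_c, x> - ||proto_c||^2
    weight = 2.0 * prototypes
    bias = -(prototypes ** 2).sum(dim=1)
    return weight, bias
```

The returned tensors can be copied into a fresh linear layer before inner-loop adaptation, which avoids the random-head problem the excerpt describes.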
“…Dou et al. (2019) perform meta-training on certain high-resource tasks from the GLUE benchmark and meta-test on certain low-resource tasks from the same benchmark. Bansal et al. (2019) propose a softmax parameter generator component that can enable a varying number of classes in the meta-training tasks. They choose the tasks in GLUE along with SNLI (Bowman et al., 2015) for meta-training, and use entity typing, relation classification, sentiment classification, text categorization, and scientific NLI as the test tasks.…”
Section: Meta-Learning in NLP
confidence: 99%
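A hypothetical sketch of such a softmax parameter generator follows, under the assumption that an MLP maps the pooled representation of each class's support examples to that class's softmax weight vector and bias, which is what allows the number of classes to vary across meta-training tasks. Module and variable names are invented for illustration, not taken from Bansal et al. (2019).

```python
# Hypothetical softmax parameter generator: per-class weights and biases are
# produced from pooled support-set representations, so tasks with different
# numbers of classes share the same generator parameters.
import torch
import torch.nn as nn


class SoftmaxParamGenerator(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        # Maps each class's pooled support features to (weight vector, bias).
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, hidden_size + 1),
        )

    def forward(self, support_feats, support_labels, num_classes):
        # support_feats: [N, hidden]; support_labels: [N]
        per_class = torch.stack(
            [support_feats[support_labels == c].mean(dim=0) for c in range(num_classes)]
        )                                               # [C, hidden]
        params = self.mlp(per_class)                    # [C, hidden + 1]
        weight, bias = params[:, :-1], params[:, -1]
        return weight, bias                             # logits = x @ weight.T + bias
```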
“…The problem of zero-shot and few-shot learning has lately been proposed in the context of NLP (Geng et al., 2019) using meta-learning. Model-agnostic meta-learning (MAML) has been explored to tackle tasks with disjoint label spaces (Bansal et al., 2019). However, these models are not capable of making zero-shot predictions.…”
Section: Related Work
confidence: 99%
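As background for the MAML reference above, here is a rough first-order MAML (FOMAML) sketch for episodic text classification: the inner loop adapts a copy of the parameters on each task's support set, and the gradient taken at the adapted parameters is applied back to the shared initialization. The episode format, optimizer choices, and hyperparameters are assumptions, not the cited work's implementation.

```python
# Rough first-order MAML (FOMAML) meta-update over a batch of episodes.
# Each task is a (support_x, support_y, query_x, query_y) tuple; all names
# and hyperparameters are illustrative placeholders.
import copy
import torch
import torch.nn.functional as F


def fomaml_step(model, meta_optimizer, tasks, inner_lr=1e-3, inner_steps=5):
    meta_optimizer.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        learner = copy.deepcopy(model)                          # task-specific copy
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                            # adapt on the support set
            inner_opt.zero_grad()
            F.cross_entropy(learner(support_x), support_y).backward()
            inner_opt.step()
        learner.zero_grad()
        F.cross_entropy(learner(query_x), query_y).backward()   # grads at adapted params
        # First-order approximation: accumulate the adapted-parameter gradients
        # directly onto the shared initialization.
        for p, p_adapted in zip(model.parameters(), learner.parameters()):
            if p_adapted.grad is None:
                continue
            p.grad = p_adapted.grad.clone() if p.grad is None else p.grad + p_adapted.grad
    meta_optimizer.step()
```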
“…Few-shot transfer learning. Real-world text classification scenarios are often characterized by a lack of annotated corpora and rapidly changing information needs (Chiticariu et al., 2013), motivating research into methods that allow us to train text classifiers for new classes with only a handful of training examples (Bansal et al., 2019; Yogatama et al., 2019). In such cases, a standard approach is to transfer knowledge from an existing model for classification task X to initialize the weights for a model for the new classification task Y.…”
Section: Introduction
confidence: 99%
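A minimal sketch of that standard transfer recipe, assuming a PyTorch model with an encoder plus a `classifier` head (the attribute name, checkpoint path, and sizes are placeholders): reuse the encoder weights trained on task X and re-initialize only the output layer for task Y's label space.

```python
# Sketch: initialize a model for new task Y from a checkpoint trained on task X,
# keeping the encoder weights and replacing only the task-specific head.
import torch
import torch.nn as nn


def init_from_pretrained(model, checkpoint_path, num_new_classes, hidden_size):
    state = torch.load(checkpoint_path, map_location="cpu")
    # Keep encoder weights from task X; drop the old task-specific head.
    state = {k: v for k, v in state.items() if not k.startswith("classifier.")}
    model.load_state_dict(state, strict=False)
    # Fresh output layer for task Y's label space, fine-tuned on the few examples.
    model.classifier = nn.Linear(hidden_size, num_new_classes)
    return model
```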