Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.448

Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks

Abstract: Pre-trained transformer models have shown enormous success in improving performance on several downstream tasks. However, fine-tuning on a new task still requires large amounts of task-specific labeled data to achieve good performance. We consider this problem of learning to generalize to new tasks with a few examples as a meta-learning problem. While meta-learning has shown tremendous progress in recent years, its application is still limited to simulated problems or problems with limited diversity across task…
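The abstract frames few-shot generalization as a meta-learning problem. As a minimal sketch of the standard episodic setup this usually implies (an illustration, not code from the paper; `k_shot` and `n_query` are assumed placeholder values), each training task is turned into an episode with a small labeled support set and a held-out query set:

```python
# Illustrative sketch: building k-shot "episodes" for meta-learning.
# Each episode mimics a new task seen with only a few labeled examples
# (support set) plus held-out examples for evaluation (query set).
import random
from collections import defaultdict

def sample_episode(examples, k_shot=4, n_query=8):
    """examples: list of (text, label) pairs drawn from one classification task."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    support, query = [], []
    for label, texts in by_label.items():
        random.shuffle(texts)
        support += [(t, label) for t in texts[:k_shot]]
        query += [(t, label) for t in texts[k_shot:k_shot + n_query]]
    return support, query

# During meta-training, episodes are drawn from many source tasks; at test time
# the model adapts on the support set of an unseen task and is scored on its
# query set.
```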

Cited by 68 publications (101 citation statements)
References 45 publications
“…Pre-trained language models have also been applied to few-shot text classification. LEOPARD (Bansal et al., 2020) uses BERT (Devlin et al., 2019) with an optimization-based meta-learning framework to achieve good performance on diverse NLP classification tasks. More recently, GPT-3 (Brown et al., 2020) shows that the language model itself can be used to perform few-shot text classification without using meta-learning.…”
Section: Related Work
Mentioning confidence: 99%
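The citing works above describe LEOPARD as pairing BERT with optimization-based meta-learning. Below is a minimal first-order MAML-style sketch of that general recipe; it is an illustration under assumptions, not the LEOPARD implementation (the `model` is assumed to map tokenized inputs to class logits, and the support/query batches come from an episode sampler like the one sketched earlier):

```python
# Hedged sketch of first-order optimization-based meta-learning (FOMAML-style)
# on top of a pre-trained encoder such as BERT.
import copy
import torch
import torch.nn.functional as F

def meta_train_step(model, tasks, meta_optimizer, inner_lr=1e-3, inner_steps=5):
    meta_optimizer.zero_grad()
    for support_batch, query_batch in tasks:            # a batch of few-shot tasks
        learner = copy.deepcopy(model)                   # task-specific copy
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)

        # Inner loop: adapt the copy on the task's small support set.
        for _ in range(inner_steps):
            x, y = support_batch
            loss = F.cross_entropy(learner(x), y)
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()

        # Outer loop (first-order approximation): evaluate the adapted copy on
        # the query set and accumulate its gradients into the original model.
        x_q, y_q = query_batch
        query_loss = F.cross_entropy(learner(x_q), y_q)
        grads = torch.autograd.grad(query_loss, learner.parameters())
        for p, g in zip(model.parameters(), grads):
            p.grad = g if p.grad is None else p.grad + g
    meta_optimizer.step()
```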
“…Since these applications come with well-defined task distributions, they do not have the same overfitting challenges. On the other hand, many works deal with few-shot adaptation in settings with no clear task distribution (Dou et al., 2019; Bansal et al., 2020a) but do not address meta-overfitting, and thus are complementary to our work.…”
Section: Related Work
Mentioning confidence: 99%
“…On GLUE-SciTail, we compare against SMLMT (Bansal et al., 2020b) and find that MAML-DRECA improves over it by 1.5 accuracy points. However, we note that the confidence intervals of these approaches overlap, and also that Bansal et al. (2020a) consider the entire GLUE data to train the meta-learner whereas we only consider NLI datasets within GLUE. Table 3: Results on NLI few-shot learning.…”
Section: Models
Mentioning confidence: 99%
“…Text classification has a vast spectrum of applications, such as sentiment classification and intent classification. The meta-learning algorithms developed for image classification can be applied to text classification with slight modification to incorporate domain knowledge in each application (Tan et al., 2019; Geng et al., 2019; Dou et al., 2019; Bansal et al., 2019).…”
Section: Text Classification
Mentioning confidence: 99%
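The last statement notes that meta-learning algorithms developed for image classification carry over to text with small modifications. One common metric-based example is a prototypical-network-style classifier over sentence embeddings; the sketch below is an assumed illustration, where `embed` stands in for any sentence encoder (for example a mean-pooled BERT), not a specific method from the cited papers:

```python
# Hedged sketch of metric-based meta-learning (prototypical-network style)
# adapted to text: class prototypes are mean embeddings of support examples,
# and query texts are classified by distance to the prototypes.
import torch

def prototypical_predict(embed, support_texts, support_labels, query_texts):
    z_support = embed(support_texts)                     # [n_support, d]
    z_query = embed(query_texts)                         # [n_query, d]

    classes = sorted(set(support_labels))
    labels = torch.tensor([classes.index(l) for l in support_labels])
    # Prototype = mean support embedding per class.
    prototypes = torch.stack(
        [z_support[labels == c].mean(dim=0) for c in range(len(classes))]
    )                                                    # [n_classes, d]

    # Negative squared Euclidean distance as class scores.
    dists = torch.cdist(z_query, prototypes) ** 2        # [n_query, n_classes]
    return (-dists).softmax(dim=-1)                      # class probabilities
```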