2018
DOI: 10.48550/arxiv.1811.01088
Preprint
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks

Cited by 115 publications (188 citation statements)
References 0 publications
“…Within the context of transfer learning, intermediate-task training refers to fine-tuning a pre-trained model on an intermediate task before fine-tuning it on a final target task. This has been found to provide an additional improvement to target task performance compared to using the pre-trained model alone (Vu et al., 2020; Phang et al., 2018). We provide an analog of this in our merging framework by merging a model fine-tuned on the target task with a model fine-tuned on the intermediate task.…”
Section: Intermediate-task Training
Citation type: mentioning (confidence: 98%)
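The merging mentioned in that statement can be pictured with a simple parameter-interpolation sketch. This is a minimal illustration, not the cited paper's actual merging method: the checkpoint paths, the `average_state_dicts` helper, and the interpolation weight `alpha` are all assumptions made for the example.

```python
# Minimal sketch (assumed, not the cited paper's method): interpolate the
# parameters of a target-task model with those of an intermediate-task model.
from transformers import AutoModelForSequenceClassification

def average_state_dicts(model_a, model_b, alpha=0.5):
    """Linearly interpolate parameters shared by the two checkpoints."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    merged = {}
    for name, p_a in sd_a.items():
        if name in sd_b and sd_b[name].shape == p_a.shape:
            merged[name] = alpha * p_a + (1.0 - alpha) * sd_b[name]
        else:
            merged[name] = p_a  # e.g. task-specific heads stay as in the target model
    return merged

# Hypothetical checkpoint paths: one model fine-tuned on the target task,
# one fine-tuned on the intermediate task.
target = AutoModelForSequenceClassification.from_pretrained("ckpts/target-finetuned")
intermediate = AutoModelForSequenceClassification.from_pretrained("ckpts/intermediate-finetuned")

target.load_state_dict(average_state_dicts(target, intermediate), strict=False)
```

The fixed uniform weight here is purely for illustration; how the merging coefficients are chosen is exactly what a real merging framework would specify.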
“…In computer vision, pre-training is typically done on a large labeled dataset like ImageNet (Deng et al., 2009; Russakovsky et al., 2015), whereas applications of transfer learning to natural language processing typically pre-train through self-supervised training on a large unlabeled text corpus. Recently, it has been shown that training on an "intermediate" task between pre-training and fine-tuning can further boost performance (Phang et al., 2018; Vu et al., 2020; Pruksachatkun et al., 2020; Phang et al., 2020). Alternatively, continued self-supervised training on unlabelled domain-specialized data can serve as a form of domain adaptation (Gururangan et al., 2020).…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
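The continued self-supervised training mentioned at the end of that statement can be sketched as a short masked-language-modeling run on in-domain text. This is a generic illustration, not Gururangan et al.'s recipe; the model name, corpus file, and hyperparameters are placeholders.

```python
# Sketch (assumed corpus file, model choice, hyperparameters) of continued
# masked-LM training on unlabeled domain text before any task fine-tuning.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Assumed: one in-domain document per line in a plain-text file.
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
corpus = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain_adapted_lm", num_train_epochs=1),
    train_dataset=corpus,
    # The collator masks ~15% of tokens and builds the MLM labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
trainer.save_model("domain_adapted_lm")  # later fine-tuning starts from this checkpoint
```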
“…We are interested in whether such a similarity between QA tasks and sequence-pair text classification tasks can make a difference. In terms of training procedure, we follow previous work (Phang et al., 2018; Vu et al., 2020). Specifically, we first fine-tune a pre-trained LM on SQuAD-2.0 (intermediate training stage) and then fine-tune it on each text classification task.…”
Section: Methods
Citation type: mentioning (confidence: 99%)
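That two-stage procedure, intermediate fine-tuning on extractive QA followed by target fine-tuning on a classification task, could look roughly like the following. This is a hedged sketch, not the cited authors' code: `squad_v2_train` and `target_cls_train` are assumed to be already tokenized datasets, and the hyperparameters are placeholders.

```python
# Two-stage sketch (assumed datasets and hyperparameters): QA intermediate task,
# then a sequence-pair classification target task on the same encoder.
from transformers import (AutoModelForQuestionAnswering,
                          AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

base = "bert-base-uncased"

# Stage 1: intermediate fine-tuning on an extractive QA dataset (e.g. SQuAD 2.0).
qa_model = AutoModelForQuestionAnswering.from_pretrained(base)
stage1 = Trainer(
    model=qa_model,
    args=TrainingArguments(output_dir="stage1_qa", num_train_epochs=2),
    train_dataset=squad_v2_train,  # assumed: pre-tokenized QA examples
)
stage1.train()
qa_model.save_pretrained("stage1_qa")

# Stage 2: load the intermediate checkpoint as a classifier; the QA head is
# discarded and a fresh classification head is initialized on top of the encoder.
cls_model = AutoModelForSequenceClassification.from_pretrained("stage1_qa", num_labels=2)
stage2 = Trainer(
    model=cls_model,
    args=TrainingArguments(output_dir="stage2_cls", num_train_epochs=3),
    train_dataset=target_cls_train,  # assumed: tokenized sequence-pair examples
)
stage2.train()
```

Repeating stage 2 independently per target task, starting each time from the saved intermediate checkpoint, matches the "each text classification task" phrasing above.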
“…Another effective transfer learning approach, intermediate training, first trains an LM on an intermediate task in a supervised manner and then fine-tunes it on target tasks. This also leads to promising results across various NLP tasks, including text classification, QA, and sequence labeling (Phang et al., 2018; Vu et al., 2020; Pruksachatkun et al., 2020).…”
Section: Introduction
Citation type: mentioning (confidence: 93%)
“…Transfer Learning. A large body of work has attempted to leverage multi-task learning to endow a model with an inductive bias that improves generalization on a main task of interest (Caruana, 1998; Bakker & Heskes, 2003; Raffel et al., 2020), with recent work in NLP sharing our focus on neural networks (Sogaard & Goldberg, 2016; Hashimoto et al., 2016; Swayamdipta et al., 2018; for a review, see Ruder, 2017). Intermediate training of pre-trained sentence encoders on a task or a set of tasks that are related to the task of interest has been advocated, among others, by Phang et al. (2018) and Aghajanyan et al. (2021). Gururangan et al. (2020) craft a training pipeline where a pre-trained language model is adapted to domain-specific and then task-specific corpora.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)