Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.635

Exploring and Predicting Transferability across NLP Tasks

Abstract: Recent advances in NLP demonstrate the effectiveness of training large-scale language models and transferring them to downstream tasks. Can fine-tuning these models on tasks other than language modeling further improve performance? In this paper, we conduct an extensive study of the transferability between 33 NLP tasks across three broad classes of problems (text classification, question answering, and sequence labeling). Our results show that transfer learning is more beneficial than previously thought, espec…

Cited by 83 publications (124 citation statements). References 38 publications.
“…Nonetheless, determining the value of a source corpus is challenging as it is affected by many factors, including the quality of the source data, the amount of the source data, and the difference between source and target at the lexical, syntactic, and semantic levels (Ahmad et al, 2019;Lin et al, 2019). The current source valuation or ranking methods are often based on single source transfer performance (McDonald et al, 2011;Lin et al, 2019;Vu et al, 2020) or leave-one-out approaches (Tommasi and Caputo, 2009;Li et al, 2016;Feng et al, 2018;Rahimi et al, 2019). They do not consider the combinations of the sources.…”
Section: Notations (mentioning)
confidence: 99%
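To make the leave-one-out idea in the statement above concrete, here is a minimal sketch (an illustration, not the cited papers' implementation): each source corpus is scored by how much target performance drops when that corpus is removed from the training pool. The train_and_evaluate helper and the name attribute on source objects are hypothetical placeholders.

    # Illustrative leave-one-out source valuation; `train_and_evaluate`
    # is a hypothetical helper that trains on the given sources and
    # returns a score on the target dev set.
    def leave_one_out_values(sources, target_dev, train_and_evaluate):
        full_score = train_and_evaluate(sources, target_dev)
        values = {}
        for held_out in sources:
            remaining = [s for s in sources if s is not held_out]
            # Value of a source = performance lost when it is left out.
            values[held_out.name] = full_score - train_and_evaluate(remaining, target_dev)
        return values

As the quoted statement notes, such per-source scores ignore interactions between sources, since each corpus is only ever removed in isolation.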
“…Transfer learning has been widely used in learning models for low-resource scenarios by leveraging the supervision provided in data-rich source corpora. It has been applied to NLP tasks in various settings including domain adaptation (Blitzer et al, 2007;Ruder and Plank, 2017), cross-lingual transfer (Täckström et al, 2013;Wu and Dredze, 2019), and task transfer (Liu et al, 2019b;Vu et al, 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…Another question that remains largely unexplored is whether this data shortage problem can instead be addressed by using training data in one or several other non-target languages. An intermediate training mechanism has been proposed (Yogatama et al, 2019;Wang et al, 2019a;Pruksachatkun et al, 2020;Vu et al, 2020) to reduce the need for large scale data for all tasks in all languages. In the intermediate training step, instead of fine-tuning the LM directly on the target language task, it is first trained on a similar task using the same or different language data.…”
Section: Introduction (mentioning)
confidence: 99%
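The intermediate training step described in the statement above can be illustrated with a minimal PyTorch sketch (a toy example with synthetic data and a made-up two-layer encoder, not the setup of any cited paper): the encoder is first fine-tuned on an intermediate source task with its own classification head, then the head is discarded and the same encoder is fine-tuned again on the low-resource target task.

    # Toy intermediate-task fine-tuning sketch; dimensions, tasks, and
    # data are synthetic and purely illustrative.
    import torch
    from torch import nn

    encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))

    def finetune(encoder, num_labels, features, labels, steps=100, lr=1e-3):
        """Attach a fresh classification head and fine-tune encoder + head."""
        head = nn.Linear(64, num_labels)
        params = list(encoder.parameters()) + list(head.parameters())
        optimizer = torch.optim.Adam(params, lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(steps):
            optimizer.zero_grad()
            logits = head(encoder(features))
            loss = loss_fn(logits, labels)
            loss.backward()
            optimizer.step()
        return head

    # Synthetic intermediate "source" task (3-way classification).
    src_x, src_y = torch.randn(256, 32), torch.randint(0, 3, (256,))
    finetune(encoder, num_labels=3, features=src_x, labels=src_y)

    # Synthetic low-resource "target" task: reuse the adapted encoder,
    # discard the source head, and fine-tune again on target data.
    tgt_x, tgt_y = torch.randn(64, 32), torch.randint(0, 2, (64,))
    target_head = finetune(encoder, num_labels=2, features=tgt_x, labels=tgt_y)

Starting the second fine-tuning run from the source-adapted encoder, rather than from the raw pretrained weights, is the essence of the intermediate-task transfer discussed in the quoted statements.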
“…Recent work shows the benefits of interspersing the pretraining and finetuning steps with an intermediate pretraining step (Phang et al, 2018), (Vu et al, 2020). This intermediate step often involves supervised pretraining using labeled datasets from different domains for a task that is related to or is the same as the target task.…”
Section: Introduction (mentioning)
confidence: 99%
“…While the efficacy of such pretraining approaches has been studied in prior work for natural language understanding tasks (like entailment, question answering, etc. (Vu et al, 2020)), the effect of pretraining on summarization has been far less explored.…”
Section: Introduction (mentioning)
confidence: 99%