Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.635

Exploring and Predicting Transferability across NLP Tasks

Abstract: Recent advances in NLP demonstrate the effectiveness of training large-scale language models and transferring them to downstream tasks. Can fine-tuning these models on tasks other than language modeling further improve performance? In this paper, we conduct an extensive study of the transferability between 33 NLP tasks across three broad classes of problems (text classification, question answering, and sequence labeling). Our results show that transfer learning is more beneficial than previously thought, espec…

Cited by 83 publications (124 citation statements). References 38 publications.
“…Nonetheless, determining the value of a source corpus is challenging as it is affected by many factors, including the quality of the source data, the amount of the source data, and the difference between source and target at the lexical, syntactic, and semantic levels (Ahmad et al, 2019;Lin et al, 2019). The current source valuation or ranking methods are often based on single source transfer performance (McDonald et al, 2011;Lin et al, 2019;Vu et al, 2020) or leave-one-out approaches (Tommasi and Caputo, 2009;Li et al, 2016;Feng et al, 2018;Rahimi et al, 2019). They do not consider the combinations of the sources.…”
Section: Notations (mentioning)
confidence: 99%
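To make the leave-one-out idea in the statement above concrete, here is a minimal sketch (an illustration, not the cited papers' implementation): each source corpus is scored by how much target performance drops when that corpus is removed from the training pool. The train_and_evaluate helper and the name attribute on source objects are hypothetical placeholders.

    # Illustrative leave-one-out source valuation; `train_and_evaluate`
    # is a hypothetical helper that trains on the given sources and
    # returns a score on the target dev set.
    def leave_one_out_values(sources, target_dev, train_and_evaluate):
        full_score = train_and_evaluate(sources, target_dev)
        values = {}
        for held_out in sources:
            remaining = [s for s in sources if s is not held_out]
            # Value of a source = performance lost when it is left out.
            values[held_out.name] = full_score - train_and_evaluate(remaining, target_dev)
        return values

As the quoted statement notes, such per-source scores ignore interactions between sources, since each corpus is only ever removed in isolation.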
“…Transfer learning has been widely used in learning models for low-resource scenarios by leveraging the supervision provided in data-rich source corpora. It has been applied to NLP tasks in various settings including domain adaptation (Blitzer et al, 2007;Ruder and Plank, 2017), cross-lingual transfer (Täckström et al, 2013;Wu and Dredze, 2019), and task transfer (Liu et al, 2019b;Vu et al, 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…Another question that remains largely unexplored is whether this data shortage problem can instead be addressed by using training data in one or several other non-target languages. An intermediate training mechanism has been proposed (Yogatama et al, 2019;Wang et al, 2019a;Pruksachatkun et al, 2020;Vu et al, 2020) to reduce the need for large scale data for all tasks in all languages. In the intermediate training step, instead of fine-tuning the LM directly on the target language task, it is first trained on a similar task using the same or different language data.…”
Section: Introduction (mentioning)
confidence: 99%
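The intermediate training step described in the statement above can be illustrated with a minimal PyTorch sketch (a toy example with synthetic data and a made-up two-layer encoder, not the setup of any cited paper): the encoder is first fine-tuned on an intermediate source task with its own classification head, then the head is discarded and the same encoder is fine-tuned again on the low-resource target task.

    # Toy intermediate-task fine-tuning sketch; dimensions, tasks, and
    # data are synthetic and purely illustrative.
    import torch
    from torch import nn

    encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))

    def finetune(encoder, num_labels, features, labels, steps=100, lr=1e-3):
        """Attach a fresh classification head and fine-tune encoder + head."""
        head = nn.Linear(64, num_labels)
        params = list(encoder.parameters()) + list(head.parameters())
        optimizer = torch.optim.Adam(params, lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(steps):
            optimizer.zero_grad()
            logits = head(encoder(features))
            loss = loss_fn(logits, labels)
            loss.backward()
            optimizer.step()
        return head

    # Synthetic intermediate "source" task (3-way classification).
    src_x, src_y = torch.randn(256, 32), torch.randint(0, 3, (256,))
    finetune(encoder, num_labels=3, features=src_x, labels=src_y)

    # Synthetic low-resource "target" task: reuse the adapted encoder,
    # discard the source head, and fine-tune again on target data.
    tgt_x, tgt_y = torch.randn(64, 32), torch.randint(0, 2, (64,))
    target_head = finetune(encoder, num_labels=2, features=tgt_x, labels=tgt_y)

Starting the second fine-tuning run from the source-adapted encoder, rather than from the raw pretrained weights, is the essence of the intermediate-task transfer discussed in the quoted statements.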
“…Recent work shows the benefits of interspersing the pretraining and finetuning steps with an intermediate pretraining step (Phang et al, 2018), (Vu et al, 2020). This intermediate step often involves supervised pretraining using labeled datasets from different domains for a task that is related to or is the same as the target task.…”
Section: Introduction (mentioning)
confidence: 99%
“…While the efficacy of such pretraining approaches has been studied in prior work for natural language understanding tasks (like entailment, question answering, etc. (Vu et al, 2020)), the effect of pretraining on summarization has been far less explored.…”
Section: Introduction (mentioning)
confidence: 99%