2020
DOI: 10.48550/arxiv.2010.07835
Preprint

Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach

Abstract: Fine-tuned pre-trained language models (LMs) achieve enormous success in many natural language processing (NLP) tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE…
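The abstract describes contrastive self-training for fine-tuning an LM on weakly labeled data without overfitting label noise. Below is a minimal sketch of what one such training step could look like, assuming a PyTorch-style `model(batch)` that returns pooled features together with classification logits; the names `confident_pseudo_labels`, `lambda_contrast`, `tau`, and `margin` are illustrative assumptions, not the paper's interface, and the regularizer is a generic pull-together/push-apart term rather than COSINE's exact loss.

```python
# Sketch of contrastive-regularized self-training on weakly labeled data.
# Not the authors' implementation; interfaces and hyperparameters are assumed.
import torch
import torch.nn.functional as F


def confident_pseudo_labels(logits, tau=0.9):
    """Keep only predictions whose softmax confidence exceeds tau."""
    probs = F.softmax(logits, dim=-1)
    conf, pseudo = probs.max(dim=-1)
    return pseudo, conf > tau          # pseudo-labels and a "trusted" mask


def contrastive_regularizer(feats, pseudo, mask, margin=1.0):
    """Pull together confident samples sharing a pseudo-label,
    push apart samples with different pseudo-labels (hinge on distance)."""
    feats = F.normalize(feats[mask], dim=-1)
    labels = pseudo[mask]
    if feats.size(0) < 2:
        return feats.new_zeros(())
    dist = torch.cdist(feats, feats)                    # pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # same-label pairs
    eye = torch.eye(len(labels), dtype=torch.bool, device=feats.device)
    pull = dist[same & ~eye].pow(2).mean() if (same & ~eye).any() else 0.0
    push = F.relu(margin - dist[~same]).pow(2).mean() if (~same).any() else 0.0
    return pull + push


def self_training_step(model, batch, optimizer, lambda_contrast=0.1, tau=0.9):
    """One update: cross-entropy on confident pseudo-labels + contrastive term."""
    feats, logits = model(batch)       # assumed to return (features, logits)
    pseudo, mask = confident_pseudo_labels(logits.detach(), tau)
    if not mask.any():                 # nothing confident enough: skip update
        return 0.0
    ce = F.cross_entropy(logits[mask], pseudo[mask])
    loss = ce + lambda_contrast * contrastive_regularizer(feats, pseudo, mask)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Thresholding the softmax confidence before generating pseudo-labels is one common way to keep a high-capacity LM from fitting noisy weak labels, which is the failure mode the abstract highlights.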

Cited by 6 publications (19 citation statements)
References 39 publications

“…Based on the study, future work may concentrate not only on data-centric research but also on the trade-off between performance and robustness considering classification based on small corpora to mitigate the effect of adversarial behavior. Additionally, future research could concentrate on how to fine-tune models with weak supervision: it could be expensive when collecting more data as well as label them all, and more studies have focused on the relevant contents (Yu et al., 2020; Awasthi et al., 2020).…”
Section: Discussion
confidence: 99%
“…Weak Supervision Methods: (1) COSINE (Yu et al., 2020): The COSINE method uses weakly labeled data to fine-tune pre-trained language models by contrastive self-training.…”
Section: Baselines
confidence: 99%
“…Weak Supervision. Weak supervision aims to reduce the cost of annotation, and has been widely applied to perform both classification (Ratner et al., 2016b, 2019a; Fu et al., 2020; Yu et al., 2020) and sequence tagging (Lison et al., 2020; Nguyen et al., 2017; Safranchik et al., 2020; Lan et al., 2020) to help reduce the human labor required for annotation. Weak supervision builds on many previous approaches in machine learning, such as distant supervision (Mintz et al., 2009; Hoffmann et al., 2011; Takamatsu et al., 2012), crowdsourcing (Gao et al., 2011; Krishna et al., 2016), co-training methods (Blum and Mitchell, 1998), pattern-based supervision (Gupta and Manning, 2014), and feature annotation (Mann and McCallum, 2010; Zaidan and Eisner, 2008).…”
Section: Related Work
confidence: 99%