2021
DOI: 10.48550/arxiv.2105.11741
Preprint

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Abstract: Learning high-quality sentence representations benefits a wide range of natural language processing tasks. Though BERT-based pretrained language models achieve high performance on many downstream tasks, the native derived sentence representations are proved to be collapsed and thus produce a poor performance on the semantic textual similarity (STS) tasks. In this paper, we present ConSERT, a Contrastive Framework for Self-Supervised SEntence Representation Transfer, that adopts contrastive learning to fine-tun…


Cited by 50 publications (79 citation statements)
References 27 publications
“…Nevertheless, we focus on unsupervised contrastive learning and form the positive pairs via data augmentation, since such methods are more cost-effective and applicable across different domains and languages. Along this line, many approaches have been developed recently, where the augmentations are obtained via sampling from surrounding or nearby contexts (Logeswaran and Lee, 2018; Giorgi et al., 2020), word- or feature-level perturbation (Yan et al., 2021), back-translation (Fang and Xie, 2020), sentence-level corruption using an auxiliary language model (Meng et al., 2021), intermediate representations of BERT (Kim et al., 2021), and dropout (Yan et al., 2021; Gao et al., 2021).…”
Section: Related Work
confidence: 99%
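The augmentation-based methods listed in this statement share one training signal: two views of the same sentence should be embedded closer to each other than to the other sentences in the batch. As a rough illustration only (not the cited papers' code), the sketch below shows an in-batch contrastive (NT-Xent-style) objective in PyTorch; the function name, the temperature value, and the restriction to cross-view negatives are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    # z1, z2: [batch, dim] sentence embeddings of two augmented views of the same batch.
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature                       # pairwise cosine similarities, [batch, batch]
    labels = torch.arange(z1.size(0), device=z1.device)   # the positive for row i is column i
    return F.cross_entropy(sim, labels)

# Example: random embeddings standing in for two views of 8 sentences.
loss = nt_xent_loss(torch.randn(8, 768), torch.randn(8, 768))
```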
“…Various contrastive learning based approaches have been proposed for learning sentence representations, with the main difference lying in how the augmentations are generated (Fang and Xie, 2020; Giorgi et al., 2020; Meng et al., 2021; Yan et al., 2021; Kim et al., 2021; Gao et al., 2021). Somewhat surprisingly, a recent work (Gao et al., 2021) empirically shows that augmentations obtained by dropout (Srivastava et al., 2014), i.e., feeding the same instance to the encoder twice, outperform common augmentation operations applied directly to the text, including cropping, word deletion, or synonym replacement.…”
Section: Introduction
confidence: 99%
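To make the dropout trick described in this statement concrete, here is a minimal sketch (not the cited implementation) of obtaining two views by running the same batch through a BERT encoder twice with dropout left active; the checkpoint name and the mean-pooling choice are illustrative assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.train()  # keep dropout enabled so the two forward passes differ

def mean_pool(hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Average token embeddings, ignoring padding positions.
    m = mask.unsqueeze(-1).float()
    return (hidden * m).sum(1) / m.sum(1).clamp(min=1e-9)

sentences = ["A man is playing a guitar.", "Two dogs run on the beach."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

# Two stochastic forward passes over identical inputs give slightly different
# embeddings per sentence; (z1[i], z2[i]) forms a positive pair, the rest of the
# batch serves as negatives.
z1 = mean_pool(encoder(**batch).last_hidden_state, batch["attention_mask"])
z2 = mean_pool(encoder(**batch).last_hidden_state, batch["attention_mask"])
```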
“…Contrastive learning (CL) has recently been attracting researchers' attention across many areas. After witnessing its superiority in Computer Vision tasks (Chen et al. 2020; He et al. 2020), researchers in NLP are also applying these techniques (Wu et al. 2020; Karpukhin et al. 2020; Yan et al. 2021; Giorgi et al. 2021; Gao, Yao, and Chen 2021). For ODPR, the research lines of CL can be divided into two types: (i) improving the sampling strategies for positive samples and hard negative samples.…”
Section: Contrastive Learning in NLP
confidence: 99%
“…To promote robustness, several data augmentations are used in training. Following [5], we adopt token shuffling, cutoff, and dropout. The token shuffling strategy randomly shuffles the order of the tokens in the token embeddings.…”
Section: Data Augmentation
confidence: 99%
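As a rough sketch of the embedding-level augmentations named in this statement (an illustration under assumptions, not the implementation from [5]), the hypothetical helpers below permute the token order within each sequence and zero out random token positions in a batch of token embeddings; the function names and the cutoff rate are illustrative.

```python
import torch

def token_shuffle(token_embeds: torch.Tensor) -> torch.Tensor:
    # token_embeds: [batch, seq_len, dim]; shuffle the token order within each sequence.
    batch, seq_len, dim = token_embeds.shape
    perm = torch.stack([torch.randperm(seq_len) for _ in range(batch)])  # [batch, seq_len]
    index = perm.unsqueeze(-1).expand(-1, -1, dim).to(token_embeds.device)
    return torch.gather(token_embeds, 1, index)

def token_cutoff(token_embeds: torch.Tensor, rate: float = 0.15) -> torch.Tensor:
    # Zero out each token position independently with probability `rate`.
    keep = (torch.rand(token_embeds.shape[:2], device=token_embeds.device) > rate).float()
    return token_embeds * keep.unsqueeze(-1)

# Example on a dummy batch of 4 sequences, 16 tokens, 768-dim embeddings.
view_a = token_shuffle(torch.randn(4, 16, 768))
view_b = token_cutoff(torch.randn(4, 16, 768))
```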