2022
DOI: 10.48550/arxiv.2211.04928
Preprint

miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings

Abstract: This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding. The proposed approach imposes alignment between the attention patterns of different views during contrastive learning. Learning sentence embeddings with miCSE entails enforcing syntactic consistency across augmented views for every single sentence, making contrastive self-supervised learning more sample-efficient. As a result, the proposed appro…
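To make the abstract's description concrete, the following is a minimal, hypothetical PyTorch sketch of a contrastive (InfoNCE) objective over two augmented views of the same sentences combined with a regularizer that aligns their attention patterns. It is not the authors' implementation: the function names, the mean-squared distance used for the alignment term, and the weight lambda_align are illustrative assumptions.

import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.05):
    # Standard in-batch contrastive (InfoNCE) loss between the two view
    # embeddings z1, z2 of shape (batch, dim); positives lie on the diagonal.
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)

def attention_alignment(att1, att2):
    # Penalize divergence between the attention tensors of the two views,
    # e.g. shape (batch, heads, seq, seq). Mean-squared distance is a
    # stand-in here, not necessarily the measure used in the paper.
    return F.mse_loss(att1, att2)

def joint_loss(z1, z2, att1, att2, lambda_align=0.1):
    # Combined objective: contrastive term plus attention-alignment regularizer.
    # lambda_align is an assumed trade-off weight, not a value from the paper.
    return info_nce(z1, z2) + lambda_align * attention_alignment(att1, att2)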

Cited by 1 publication (3 citation statements)
References 26 publications
“…Refs. [4, 6–15] use contrastive learning. BERT-Flow and BERT-whitening [5, 55] are post-processing models that apply flow-network and whitening to enhance BERT, respectively.…”
Section: Baseline and Previous Models (mentioning)
Confidence: 99%
“…The best average result is in bold in the last column. †: [33], ‡: [9], ♠: [6], ♣: [10], : [56], : [4], ♥: [7], ♦: [8], : [13], : [14], *: [15], SBERT-base-nli-v2: reproduced by ourselves, and the rest of the results are taken from Ref. [12].…”
Section: Reproducing SBERT-base-nli-v2 Model (mentioning)
Confidence: 99%