2022
DOI: 10.1007/978-3-030-97546-3_59

SimSCL: A Simple Fully-Supervised Contrastive Learning Framework for Text Representation

Abstract: HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Cited by 3 publications (1 citation statement)
References 19 publications (19 reference statements)
“…Following common practice in contrastive learning, we first study the importance of adding a projection head that maps representations to a new space where the supervised contrastive loss is applied. Similar to [9], [42], we tested three different MLP architectures: (1) identity mapping; (2) linear projection $z = g(h) = W^{(1)} h \in \mathbb{R}^{512}$; (3) non-linear projection with one additional hidden layer, as used by several previous approaches, $z = g(h) = W^{(2)} \mathrm{ReLU}(W^{(1)} h) \in \mathbb{R}^{512}$. Similar to what was found in previous works, we observe that a non-linear architecture is better than both the linear and the identity functions for the projection head network (see Table 2).…”
Section: Classification Accuracy
confidence: 99%
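For illustration, here is a minimal PyTorch sketch of the three projection-head variants compared in the statement above. The encoder width of 768, the hidden width, and the bias-free layers are assumptions for the sake of a runnable example; the quote only fixes the 512-dimensional output space.

import torch
import torch.nn as nn

class ProjectionHead(nn.Module):
    """Maps encoder representations h to the space where the
    supervised contrastive loss is applied."""
    def __init__(self, in_dim: int, out_dim: int = 512, kind: str = "nonlinear"):
        super().__init__()
        if kind == "identity":
            # (1) identity mapping: z = h
            self.g = nn.Identity()
        elif kind == "linear":
            # (2) linear projection: z = W^(1) h
            self.g = nn.Linear(in_dim, out_dim, bias=False)
        elif kind == "nonlinear":
            # (3) one hidden layer: z = W^(2) ReLU(W^(1) h)
            self.g = nn.Sequential(
                nn.Linear(in_dim, in_dim, bias=False),
                nn.ReLU(),
                nn.Linear(in_dim, out_dim, bias=False),
            )
        else:
            raise ValueError(f"unknown projection head kind: {kind}")

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.g(h)

# Usage: project a batch of (hypothetical) 768-d encoder outputs to 512-d.
h = torch.randn(32, 768)
z = ProjectionHead(768, kind="nonlinear")(h)
print(z.shape)  # torch.Size([32, 512])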