Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.427

Supporting Clustering with Contrastive Learning

Abstract: Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space. However, different categories often overlap with each other in the representation space at the beginning of the learning process, which poses a significant challenge for distance-based clustering in achieving good separation between different categories. To this end, we propose Supporting Clustering with Contrastive Learning (SCCL), a novel framework to leverage contrastive learning …
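The abstract frames SCCL as supporting a distance-based clustering objective with contrastive learning. As a rough, hedged illustration of that kind of joint training (not the authors' exact implementation), the PyTorch sketch below combines an instance-wise InfoNCE loss over two augmented views with a DEC-style clustering loss; the encoder, temperature, centroid parameters, and loss weight eta are illustrative assumptions.

```python
# Hedged sketch of a joint "contrastive + clustering" objective in PyTorch.
# The encoder, temperature, and loss weight are assumptions for illustration.
import torch
import torch.nn.functional as F

def infonce_loss(z1, z2, temperature=0.5):
    """Instance-wise contrastive loss between two augmented views, each [B, D]."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                        # [2B, D]
    sim = z @ z.t() / temperature                         # pairwise cosine similarities
    batch = z1.size(0)
    mask = torch.eye(2 * batch, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))            # exclude self-similarity
    # The positive for row i is its other augmented view: i <-> i + B.
    targets = torch.cat([torch.arange(batch, 2 * batch),
                         torch.arange(0, batch)]).to(z.device)
    return F.cross_entropy(sim, targets)

def clustering_loss(embeddings, centroids, alpha=1.0):
    """DEC-style loss: KL(target distribution || Student-t soft assignments)."""
    dist_sq = torch.cdist(embeddings, centroids) ** 2     # [B, K]
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    q = q / q.sum(dim=1, keepdim=True)                    # soft cluster assignments
    weight = q ** 2 / q.sum(dim=0)                        # sharpen confident assignments
    p = (weight.t() / weight.sum(dim=1)).t()              # batch-level target (a simplification)
    return F.kl_div(q.log(), p.detach(), reduction="batchmean")

# Joint objective for one batch (emb_orig, emb_aug1, emb_aug2 come from an encoder):
# loss = clustering_loss(emb_orig, centroids) + eta * infonce_loss(emb_aug1, emb_aug2)
```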

Cited by 86 publications (53 citation statements); references 35 publications.

“…However, generators are computationally expensive for end-to-end training, and often less effective than discriminative models [15,27,33] in feature learning [11]. Recent research has considered contrastive learning in clustering [28,52,70,85]. We discuss their drawbacks and their relations to TCC in Sec.…”
Section: Related Work
confidence: 99%
“…This has further motivated the development of a two-stage clustering pipeline [70] with contrastive pre-training and k-means [55]. An alternative simple migration [85] yields a composition of an InfoNCE loss [61] and a clustering one [77]. Compared with the deep generative counterparts [16,36,53,82], contrastive clustering is free from decoding and computationally practical, with guaranteed feature quality.…”
Section: Introduction
confidence: 99%
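For contrast with the joint objective sketched earlier, the two-stage pipeline mentioned in this quote (contrastive pre-training followed by k-means) can be sketched as below; the encoder and data loader are placeholders, and scikit-learn's KMeans stands in for whichever k-means variant a given paper uses.

```python
# Hedged sketch of the two-stage pipeline: contrastive pre-training happens
# elsewhere; here we only run k-means on the frozen embeddings.
import numpy as np
import torch
from sklearn.cluster import KMeans

@torch.no_grad()
def kmeans_on_pretrained(encoder, dataloader, n_clusters, device="cpu"):
    """Embed all examples with a contrastively pre-trained encoder, then cluster."""
    encoder.eval()
    feats = []
    for batch in dataloader:                      # batch: pre-tokenized tensors (placeholder)
        feats.append(encoder(batch.to(device)).cpu().numpy())
    feats = np.concatenate(feats, axis=0)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(feats)
    return km.labels_, km.cluster_centers_
```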
“…However, this requires data augmentation to create positive example pairs. For text, some augmentations use back-translation (Cao and Wang, 2021; Zhang et al., 2021b). Taking inspiration from these clustering and representation learning techniques, we employ back-translation as data augmentation to create more positive pairs, improving the learning of attention weights between event mentions.…”
Section: Back-translation
confidence: 99%
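As a hedged sketch of the back-translation augmentation described in the quote above, the snippet below round-trips each sentence through a pivot language and treats the (original, back-translated) pair as a positive pair; the MarianMT checkpoints are one possible choice, not necessarily those used in the cited works.

```python
# Back-translation augmentation sketch: English -> German -> English round trip.
# The Helsinki-NLP MarianMT checkpoints are an illustrative choice.
from transformers import MarianMTModel, MarianTokenizer

def _translate(texts, model_name):
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

def back_translate(texts):
    """Return (original, round-trip translation) positive pairs via a German pivot."""
    pivot = _translate(texts, "Helsinki-NLP/opus-mt-en-de")
    round_trip = _translate(pivot, "Helsinki-NLP/opus-mt-de-en")
    return list(zip(texts, round_trip))

# pairs = back_translate(["unsupervised clustering of short text"])
```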
“…Several new unsupervised deep clustering approaches use contrastive loss for clustering images (Zhong et al., 2020) and text (Zhang et al., 2021b). These methods require data augmentation to create positive example pairs.…”
Section: Related Work
confidence: 99%
“…We name our approach Pairwise Supervised Contrastive Learning (PairSupCon). As noted by recent work (Wu et al., 2018; Zhang et al., 2021), instance discrimination learning can implicitly group similar instances together in the representation space without any explicit learning force directing it to do so. PairSupCon leverages this implicit grouping effect to bring together representations from the same semantic category while simultaneously enhancing the semantic entailment and contradiction reasoning capability of the model.…”
Section: Introduction
confidence: 99%
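In the spirit of the PairSupCon description above, a pairwise supervised contrastive loss on NLI-style triples might look like the following sketch, where each premise's entailment hypothesis is the positive and the contradiction hypotheses act as additional in-batch negatives; this illustrates the general idea rather than the authors' exact objective.

```python
# Hedged sketch of a pairwise supervised contrastive loss over NLI triples.
import torch
import torch.nn.functional as F

def pairwise_supcon_loss(premise, entail, contra, temperature=0.05):
    """premise, entail, contra: [B, D] sentence embeddings from the same encoder."""
    premise = F.normalize(premise, dim=1)
    candidates = F.normalize(torch.cat([entail, contra], dim=0), dim=1)   # [2B, D]
    logits = premise @ candidates.t() / temperature                       # [B, 2B]
    # Column i is the entailment hypothesis of premise i; everything else is a negative.
    targets = torch.arange(premise.size(0), device=premise.device)
    return F.cross_entropy(logits, targets)
```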