2021
DOI: 10.1101/2021.07.26.453730
Preprint

Self-supervised contrastive learning for integrative single cell RNA-seq data analysis

Abstract: Single-cell RNA-sequencing (scRNA-seq) has become a powerful tool to reveal the complex biological diversity and heterogeneity among cell populations. However, the technical noise and bias of the technology still have negative impacts on the downstream analysis. Here, we present a self-supervised Contrastive LEArning framework for scRNA-seq (CLEAR) profile representation and the downstream analysis. CLEAR overcomes the heterogeneity of the experimental data with a specifically designed representation learning …


Cited by 13 publications (8 citation statements)
References 61 publications (95 reference statements)
“…Before feeding the data to the deep neural network, we first apply various data augmentation techniques to the training data sets [35, 36], including flipping, centered cropping, and adding random noise. During every epoch in the training process, the samples are flipped along their main diagonals with a probability of 0.5, while the original data set remains unmodified.…”
Section: Methods
confidence: 99%
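The augmentation scheme quoted above can be sketched in NumPy as follows. Only the 0.5 flip probability is stated in the quote; the function name, the crop fraction, the noise level, and the zero-padding convention after the centered crop are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(x, flip_p=0.5, crop_frac=0.8, noise_sd=0.1):
    # Hypothetical parameters; the quote only fixes the 0.5 flip probability.
    out = x.copy()
    # Flip along the main diagonal (a transpose) with probability flip_p.
    if rng.random() < flip_p:
        out = out.T
    # Centered crop: keep the middle crop_frac of each axis and zero-pad back
    # to the original shape (one common convention, assumed here).
    n = out.shape[0]
    k = int(n * crop_frac)
    start = (n - k) // 2
    padded = np.zeros_like(out)
    padded[start:start + k, start:start + k] = out[start:start + k,
                                                   start:start + k]
    # Add small Gaussian noise; the original sample x is left unmodified.
    return padded + rng.normal(0.0, noise_sd, size=padded.shape)

x = rng.random((8, 8))
x_aug = augment(x)
```

Applying a fresh random augmentation each epoch, while keeping the originals intact, effectively enlarges the training set without storing extra copies.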
“…The cell type labels could refer to biological cell types or the labels of data batches collected from different times or platforms. Following the idea of contrastive learning (31, 32), we employ a contrastive loss in embedding space. It enforces smaller In-Batch distance and larger Between-Batch distance.…”
Section: Methods
confidence: 99%
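The "smaller In-Batch distance, larger Between-Batch distance" idea can be illustrated with a classic pairwise contrastive loss. This is a standard formulation, not necessarily the cited work's exact loss; the margin value and function name are assumptions:

```python
import numpy as np

def contrastive_loss(emb, labels, margin=1.0):
    # Classic pairwise contrastive loss, shown for illustration only;
    # the cited work's exact formulation may differ.
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    total, pairs = 0.0, 0
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            d = np.linalg.norm(emb[i] - emb[j])
            if labels[i] == labels[j]:
                total += d ** 2                      # shrink in-group distance
            else:
                total += max(0.0, margin - d) ** 2   # push groups >= margin apart
            pairs += 1
    return total / pairs

# Two tight clusters: the loss is low when labels match the clusters
# and high when same-label points sit in different clusters.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
well_separated = contrastive_loss(emb, [0, 0, 1, 1])
mixed = contrastive_loss(emb, [0, 1, 0, 1])
```

Minimizing this loss over learned embeddings pulls cells sharing a label together while keeping differently labeled cells at least the margin apart, which is exactly the in-group/between-group behavior the quote describes.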
“…Typically, cell type annotation involves two steps: (1) clustering the cells into different subgroups and (2) labeling each group a specific type manually based on the prior-known marker genes. A number of unsupervised machine learning algorithms have been developed, including classical machine learning based methods such as Seurat 7 and Scanpy 8 , and newly published deep learning based methods, such as scDHA 9 and CLEAR 10 . However, these methods can be time-consuming and burdensome.…”
Section: Introduction
confidence: 99%
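The two-step annotation workflow described above can be sketched on toy data. A minimal k-means stands in for step (1) — real pipelines use Seurat or Scanpy clustering — and step (2) assigns each cluster the name of its most expressed marker gene. The data, marker names, and gene-to-label rule are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=20):
    # Minimal Lloyd's algorithm standing in for step (1); deterministic
    # initialization from evenly spaced rows keeps this toy reproducible.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = X[assign == c].mean(axis=0)
    return assign

# Toy expression matrix: two cell populations, each dominated by one of
# two genes (the marker names below are illustrative, not real priors).
X = np.vstack([rng.normal([5.0, 0.0], 0.3, (20, 2)),
               rng.normal([0.0, 5.0], 0.3, (20, 2))])
assign = kmeans(X, k=2)

# Step (2): label each cluster by its most expressed "known" marker gene.
marker_names = ["CD3D", "MS4A1"]
cluster_labels = {c: marker_names[int(np.argmax(X[assign == c].mean(axis=0)))]
                  for c in np.unique(assign)}
```

The manual effort criticized in the quote lives in step (2): a human must map each cluster's marker profile to a cell type, which is what supervised annotation methods aim to automate.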