2020
DOI: 10.1101/2020.05.08.083337
Preprint

Joint probabilistic modeling of paired transcriptome and proteome measurements in single cells

Abstract: The paired measurement of RNA and surface protein abundance in single cells with CITE-seq is a promising approach to connect transcriptional variation with cell phenotypes and functions. However, each data modality exhibits unique technical biases, making it challenging to conduct a joint analysis and combine these two views into a unified representation of cell state. Here we present Total Variational Inference (totalVI), a framework for the joint probabilistic analysis of paired RNA and protein data from sin…

Cited by 11 publications (14 citation statements)
References 109 publications (122 reference statements)

“…We modeled x_{n,m} with probability distributions that capture the characteristics of data distributions for each modality. For transcriptome and surface protein data, negative binomial (NB) distribution was selected to explain non-negative counts with overdispersion [14]. In addition, chromatin accessibility data is non-negative count data; however, it exhibits extreme sparsity due to low signal (only two locus exist for each diploid cell), limited coverage, and closed chromatin.…”
Section: Results | mentioning
confidence: 99%
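
The citation statement above motivates the negative binomial (NB) choice by overdispersion, meaning a variance larger than the mean. A minimal sketch of that property follows. It assumes the common mean/inverse-dispersion parameterization (`mu`, `theta`), under which the variance is mu + mu^2/theta. The function names and parameter values are illustrative only and are not taken from totalVI or from the citing paper.

```python
import math


def nb_log_pmf(x: int, mu: float, theta: float) -> float:
    """Log-probability of count x under an NB with mean mu > 0 and
    inverse dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def nb_moments(mu: float, theta: float, max_count: int = 2000):
    """Total mass, mean, and variance computed by summing the pmf over
    0..max_count-1 (the truncated tail is negligible for small mu)."""
    probs = [math.exp(nb_log_pmf(x, mu, theta)) for x in range(max_count)]
    total = sum(probs)
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return total, mean, var


# Hypothetical values chosen only for illustration.
total, mean, var = nb_moments(mu=5.0, theta=2.0)
print(f"total mass={total:.6f}  mean={mean:.4f}  variance={var:.4f}")
```

A Poisson model with the same mean would force the variance to equal the mean. Here the extra mu^2/theta term lets the variance exceed it, and as `theta` grows the NB approaches the Poisson.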
“…One powerful approach to capture nonlinear latent structures is to use expressive variational autoencoders (VAEs), which consist of a pair of neural networks wherein one encodes data into the latent space, and the other decodes them to reconstruct the data distribution [7]. scMVAE and totalVI are the currently available VAE-based methods for single-cell multimodal data analysis [13, 14]. Nevertheless, scMVAE requires a simplified conversion of chromatin accessibility to transcriptome before training, which is known to lead to the nonnegligible loss of epigenetic information [15].…”
Section: Introduction | mentioning
confidence: 99%
“…Deep learning methods for multiomic data have also been studied, but these prior works generally did not have access to large-scale paired measurements, which motivated complex techniques to align latent representations (15–18) or constrained these works to bulk measurements (19). More recently, new experimental techniques for generating paired single-cell data have enabled more streamlined multimodal modeling of protein epitopes and transcriptomics (20) as well as of physiological profiles and transcriptomics (21). BABEL builds off these prior works while introducing strategies for more efficient model architectures and latent space learning.…”
mentioning
confidence: 99%
“…scPOST currently operates using a low-dimensional PCA embedding of cells. With multimodal technologies such as CITE-seq [5] becoming more available, analyses of these new data types may include dimensionality reduction with alternative methods, such as canonical correlation analysis (CCA) [10,14,34] and nonlinear embeddings [35,36]. Simulating new types of data in the context of these alternative tools, such as simulating canonical variate coordinates instead of PC coordinates, represents a possible extension of scPOST.…”
Section: Discussion | mentioning
confidence: 99%