Deep generative models, such as variational autoencoders (VAEs) or deep Boltzmann machines (DBMs), can generate an arbitrary number of synthetic observations after being trained on an initial set of samples. This has mainly been investigated for imaging data but could also be useful for single-cell transcriptomics (scRNA-seq). A small pilot study could be used for planning a full-scale experiment by investigating planned analysis strategies on synthetic data with different sample sizes. It is unclear whether synthetic observations generated based on a small scRNA-seq dataset reflect the properties relevant for subsequent data analysis steps. We specifically investigated two deep generative modeling approaches, VAEs and DBMs. First, we considered single-cell variational inference (scVI) in two variants, generating samples from the posterior distribution, the standard approach, or the prior distribution. Second, we propose single-cell deep Boltzmann machines (scDBMs). When considering the similarity of clustering results on synthetic data to ground-truth clustering, we find that the $$scVI_{posterior}$$ variant resulted in high variability, most likely due to amplifying artifacts of small datasets. All approaches showed mixed results for cell types with different abundance by overrepresenting highly abundant cell types and missing less abundant cell types. With increasing pilot dataset sizes, the proportions of the cells in each cluster became more similar to that of ground-truth data. We also showed that all approaches learn the univariate distribution of most genes, but problems occurred with bimodality. Across all analyses, in comparing 10x Genomics and Smart-seq2 technologies, we could show that for 10x datasets, which have higher sparsity, it is more challenging to make inference from small to larger datasets. 
Overall, the results show that generative deep learning approaches might be valuable for supporting the design of scRNA-seq experiments.
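The effect of pilot size on cell-type proportions described above can be illustrated with a minimal sketch. It does not train a VAE or DBM; it only draws random pilot subsamples from a labelled dataset and compares their cluster proportions to the full data. The cell-type labels, their abundances, the pilot sizes, and the function names are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def proportions(labels, categories):
    """Fraction of observations in each category (0.0 if a category is absent)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts.get(c, 0) / n for c in categories}


def total_variation(p, q):
    """Total variation distance between two proportion dicts with the same keys."""
    return 0.5 * sum(abs(p[c] - q[c]) for c in p)


def subsample_report(labels, pilot_size, rng):
    """Draw a pilot subsample without replacement and compare it to the full data.

    Returns the total variation distance between pilot and full-data
    proportions, and the list of categories missing from the pilot.
    """
    categories = sorted(set(labels))
    truth = proportions(labels, categories)
    pilot = rng.sample(labels, pilot_size)
    estimate = proportions(pilot, categories)
    missing = [c for c in categories if estimate[c] == 0.0]
    return total_variation(truth, estimate), missing


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    # Hypothetical "ground truth": one abundant, one common, two rare cell types.
    labels = ["A"] * 7000 + ["B"] * 2500 + ["C"] * 450 + ["D"] * 50
    for size in (50, 200, 1000, 5000):
        distance, missing = subsample_report(labels, size, rng)
        print(f"pilot={size:5d}  TV distance={distance:.3f}  missing={missing}")
```

A generative model trained on such a pilot can only reproduce cell types that the pilot contains, which is one plausible reason why rare cell types are hard to recover from small pilot datasets.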
Introduction: Carotid geometry and wall shear stress (WSS) have been proposed as independent risk factors for the progression of carotid atherosclerosis, but this has not yet been demonstrated in larger longitudinal studies. Therefore, we investigated the impact of these biomarkers on carotid wall thickness in patients with high cardiovascular risk. Methods: Ninety-seven consecutive patients with hypertension, at least one additional cardiovascular risk factor and internal carotid artery (ICA) plaques (wall thickness ≥ 1.5 mm and degree of stenosis ≤ 50%) were prospectively included. They underwent high-resolution 3D multi-contrast and 4D flow MRI at 3 Tesla both at baseline and follow-up. Geometry (ICA/common carotid artery (CCA)-diameter ratio, bifurcation angle, tortuosity and wall thickness) and hemodynamics [WSS, oscillatory shear index (OSI)] of both carotid bifurcations were measured at baseline. Their predictive value for changes of wall thickness 12 months later was calculated using linear regression analysis for the entire study cohort (group 1, 97 patients) and after excluding patients with ICA stenosis ≥10% to rule out relevant inward remodeling (group 2, 61 patients). Results: In group 1, only tortuosity at baseline was independently associated with carotid wall thickness at follow-up (regression coefficient = −0.52, p < 0.001). However, after excluding patients with ICA stenosis ≥10% in group 2, ICA/CCA-ratio (0.49, p < 0.001), bifurcation angle (0.04, p = 0.001), tortuosity (−0.30, p = 0.040), and WSS (−0.03, p = 0.010) at baseline were all independently associated with changes of carotid wall thickness at follow-up. Conclusions: A large ICA bulb and bifurcation angle and low WSS seem to be independent risk factors for the progression of carotid atherosclerosis in the absence of ICA stenosis. By contrast, a high carotid tortuosity seems to be protective both in patients without and with ICA stenosis. 
These biomarkers may be helpful for the identification of patients who are at particular risk of wall thickness progression and who may benefit from intensified monitoring and treatment.
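For readers unfamiliar with the hemodynamic markers named above, the following is a minimal sketch of how time-averaged WSS and OSI are commonly computed from WSS vectors sampled over one cardiac cycle. The abstract does not state the formulas the authors used, so the commonly used definition OSI = 0.5 · (1 − |Σ τ| / Σ |τ|) is an assumption here, as are the function names and the requirement of uniformly spaced time frames.

```python
import math


def _norm(vector):
    """Euclidean length of a WSS vector given as a tuple of components."""
    return math.sqrt(sum(component * component for component in vector))


def time_averaged_wss(wss_vectors):
    """Mean WSS magnitude over uniformly spaced time frames."""
    if not wss_vectors:
        raise ValueError("at least one time frame is required")
    return sum(_norm(v) for v in wss_vectors) / len(wss_vectors)


def oscillatory_shear_index(wss_vectors):
    """OSI = 0.5 * (1 - |sum of vectors| / sum of |vectors|).

    Assumes uniformly spaced time frames, so the time step cancels.
    Ranges from 0 (unidirectional shear) to 0.5 (purely oscillating shear).
    """
    if not wss_vectors:
        raise ValueError("at least one time frame is required")
    dims = len(wss_vectors[0])
    summed = tuple(sum(v[i] for v in wss_vectors) for i in range(dims))
    sum_of_magnitudes = sum(_norm(v) for v in wss_vectors)
    if sum_of_magnitudes == 0.0:
        # Convention chosen for this sketch: no shear at all means no oscillation.
        return 0.0
    return 0.5 * (1.0 - _norm(summed) / sum_of_magnitudes)
```

With this definition, a wall segment whose shear always points the same way has an OSI of 0, and one whose shear fully reverses for half of the cycle has an OSI of 0.5.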
The extent to which the degeneration of the substantia nigra (SN) and putamen each contribute to motor impairment in Parkinson’s disease (PD) is unclear, as they are usually investigated using different imaging modalities. This study aimed to examine the pathophysiological significance of the SN and putamen in both motor impairment and the levodopa response in PD using diffusion microstructure imaging (DMI). In this monocentric retrospective cross-sectional study, DMI parameters from 108 patients with PD and 35 healthy controls (HC) were analyzed using a voxel- and region-based approach. Linear models were applied to investigate the association between individual DMI parameters and Movement Disorder Society Unified Parkinson’s Disease Rating Scale-Part 3 (MDS-UPDRS-III) performance in ON- and OFF-states, as well as the levodopa response, controlling for age and sex. Voxel- and region-based group comparisons of DMI parameters between PD and HC revealed significant differences in the SN and putamen. In PD, a poorer MDS-UPDRS-III performance in the ON-state was associated with increased free fluid in the SN (b-weight = 65.79, p = 0.004) and putamen (b-weight = 86.00, p = 0.006), and contrariwise with the demise of cells in both structures. The levodopa response was inversely associated with free fluid both in the SN (b-weight = −83.61, p = 0.009) and putamen (b-weight = −176.56, p < 0.001). Interestingly, when the two structures were assessed together, the integrity of the putamen, but not the SN, served as a predictor for the levodopa response (b-weight = −158.03, p < 0.001). Structural alterations in the SN and putamen can be measured by diffusion microstructure imaging in PD. They are associated with poorer motor performance in the ON-state, as well as a reduced response to levodopa. While both nigral and putaminal integrity are required for good performance in the ON-state, it is putaminal integrity alone that determines the levodopa response. 
Therefore, the structural integrity of the putamen is crucial for the improvement of motor symptoms to dopaminergic medication, and might therefore serve as a promising biomarker for motor staging.
Deep generative models can learn the underlying structure, such as pathways or gene programs, from omics data. We provide an introduction as well as an overview of such techniques, specifically illustrating their use with single-cell gene expression data. For example, the low dimensional latent representations offered by various approaches, such as variational auto-encoders, are useful to get a better understanding of the relations between observed gene expressions and experimental factors or phenotypes. Furthermore, by providing a generative model for the latent and observed variables, deep generative models can generate synthetic observations, which allow us to assess the uncertainty in the learned representations. While deep generative models are useful to learn the structure of high-dimensional omics data by efficiently capturing non-linear dependencies between genes, they are sometimes difficult to interpret due to their neural network building blocks. More precisely, to understand the relationship between learned latent variables and observed variables, e.g., gene transcript abundances and external phenotypes, is difficult. Therefore, we also illustrate current approaches that allow us to infer the relationship between learned latent variables and observed variables as well as external phenotypes. Thereby, we render deep learning approaches more interpretable. In an application with single-cell gene expression data, we demonstrate the utility of the discussed methods.
Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments.
Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider six popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments.
Background: Dopamine transporter SPECT is an established method to investigate nigrostriatal integrity in case of clinically uncertain parkinsonism. Objective: The present study explores whether a data-driven analysis of [123I]FP-CIT SPECT is able to stratify patients according to mortality after SPECT. Methods: Patients from our clinical registry were included if they had received [123I]FP-CIT SPECT between 10/2008 and 06/2016 for diagnosis of parkinsonism and if their vital status could be determined in 07/2017. Specific binding ratios (SBR) of the whole striatum, its asymmetry (asymmetry index, AI; absolute value), and the rostrocaudal gradient of striatal binding (C/pP: caudate SBR divided by posterior putamen SBR) were used as input for hierarchical clustering of patients. We tested differences in survival between these groups (adjusted for age) with a Cox proportional hazards model. Results: Data from 518 patients were analyzed. Median follow-up duration was 3.3 years [95% C.I. 3.1 to 3.7]. Three subgroups identified by hierarchical clustering were characterized by relatively low striatal SBR, high AI, and low C/pP (group 1), low striatal SBR, high AI, and high C/pP (group 2), and high striatal SBR, low AI, and low C/pP (group 3). Mortality was significantly higher in group 1 compared to each of the other two groups (p = 0.029 and p = 0.003, respectively). Conclusion: Data-driven analysis of [123I]FP-CIT SPECT identified a subgroup of patients with significantly increased mortality during follow-up. This suggests that 123I-FP-CIT SPECT might not only serve as a diagnostic tool to verify nigrostriatal degeneration but also provide valuable prognostic information.
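The three clustering inputs described above can be sketched as follows. Only the C/pP ratio (caudate SBR divided by posterior putamen SBR) is defined in the abstract. The asymmetry index formula (absolute left-right difference relative to the mean, in percent), the averaging of both hemispheres, and all class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StriatalSBR:
    """Hypothetical container for regional specific binding ratios (SBR)."""
    striatum_left: float
    striatum_right: float
    caudate_left: float
    caudate_right: float
    posterior_putamen_left: float
    posterior_putamen_right: float


def mean2(a, b):
    """Mean of two hemispheric values."""
    return (a + b) / 2.0


def clustering_features(sbr):
    """Return (whole-striatum SBR, asymmetry index in percent, C/pP ratio).

    Assumed definitions, not confirmed by the abstract:
    - whole-striatum SBR is the mean of the left and right striatum,
    - AI = |left - right| / mean(left, right) * 100,
    - C/pP uses hemisphere-averaged caudate and posterior putamen SBR.
    """
    striatal = mean2(sbr.striatum_left, sbr.striatum_right)
    asymmetry = abs(sbr.striatum_left - sbr.striatum_right) / striatal * 100.0
    caudate = mean2(sbr.caudate_left, sbr.caudate_right)
    posterior_putamen = mean2(sbr.posterior_putamen_left, sbr.posterior_putamen_right)
    return striatal, asymmetry, caudate / posterior_putamen


if __name__ == "__main__":
    # Made-up values for a single patient, for demonstration only.
    patient = StriatalSBR(2.0, 1.0, 2.4, 1.6, 1.2, 0.8)
    print(clustering_features(patient))
```

Feature vectors of this kind, one per patient, would then be passed to a hierarchical clustering routine. In practice the features would likely be standardized first, because the three quantities are on different scales.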