Single-cell RNA sequencing (scRNA-seq) technologies enable a better understanding of previously unexplored biological diversity. Oftentimes, researchers are specifically interested in modeling the latent structures and variations enriched in one target scRNA-seq dataset as compared to another background dataset generated from sources of variation irrelevant to the task at hand. For example, we may wish to isolate factors of variation only present in measurements from patients with a given disease as opposed to those shared with data from healthy control subjects. Here we introduce Contrastive Variational Inference (contrastiveVI; https://github.com/suinleelab/contrastiveVI), a framework for end-to-end analysis of target scRNA-seq datasets that decomposes variation into shared and target-specific factors. On three target-background dataset pairs, we demonstrate that contrastiveVI learns latent representations that recover known subgroups of target data points better than previous methods and finds differentially expressed genes that agree with known ground truths.
Single-cell RNA sequencing (scRNA-seq) technologies have enabled a greater understanding of previously unexplored biological diversity. Based on the design of such experiments, individual cells from scRNA-seq datasets can often be attributed to non-overlapping "groups". For example, these group labels may denote the cell's tissue or cell line of origin. In this setting, one important problem consists in discerning patterns in the data that are shared across groups versus those that are group-specific. However, existing methods for this type of analysis are mainly limited to (generalized) linear latent variable models. Here we introduce multiGroupVI, a deep generative model for analyzing grouped scRNA-seq datasets that decomposes the data into shared and group-specific factors of variation. We first validate our approach on a simulated dataset, on which we significantly outperform state-of-the-art methods. We then apply it to explore regional differences in an scRNA-seq dataset sampled from multiple regions of the mouse small intestine. We implemented multiGroupVI using the scvi-tools library, and released it as open-source software at https://github.com/Genentech/multiGroupVI.
Deep neural networks (DNNs) capture complex relationships among variables; however, because they require copious samples, their potential has yet to be fully tapped for understanding relationships between gene expression and human phenotypes. Here we introduce an analysis framework, namely MD-AD (Multi-task Deep learning for Alzheimer’s Disease neuropathology), which leverages an unexpected synergy between DNNs and multi-cohort settings. In these settings, true joint analysis can be stymied using conventional statistical methods, which require “harmonized” phenotypes and tend to capture cohort-level variations, obscuring subtler true disease signals. Instead, MD-AD incorporates related phenotypes sparsely measured across cohorts, and learns interactions between genes and phenotypes not discovered using linear models, identifying subtler signals than cohort-level variations which can be uniquely recapitulated in animal models and across tissues. We show that MD-AD exploits sex-specific relationships between microglial immune response and neuropathology, providing a nuanced context for the association between inflammatory genes and Alzheimer’s Disease.
Context.— Large-cell transformation (LCT) of indolent B-cell lymphomas, such as follicular lymphoma (FL) and chronic lymphocytic leukemia (CLL), signals a worse prognosis, at which point aggressive chemotherapy is initiated. Although LCT is relatively straightforward to diagnose in lymph nodes, a marrow biopsy is often obtained first given its ease of procedure, low cost, and low morbidity. However, consensus criteria for LCT in bone marrow have not been established. Objective.— To study the accuracy and reproducibility of a trained convolutional neural network in identifying LCT, in light of promising machine learning tools that may introduce greater objectivity to morphologic analysis. Design.— We retrospectively identified patients with a diagnosis of FL or CLL who had undergone bone marrow biopsy for the clinical question of LCT. We scored morphologic criteria and correlated results with clinical disease progression. In addition, whole slide scans were annotated into patches to train convolutional neural networks to discriminate between small and large tumor cells and to predict the patient's probability of transformation. Results.— Using morphologic examination, the proportion of large lymphoma cells (≥10% in FL and ≥30% in CLL), chromatin pattern, distinct nucleoli, and proliferation index were significantly correlated with LCT in FL and CLL. Compared to pathologist-derived estimates, machine-generated quantification demonstrated better reproducibility and stronger correlation with final outcome data. Conclusions.— These histologic findings may serve as indications of LCT in bone marrow biopsies. The pathologist augmented with the machine system appeared to be the most predictive, arguing for greater efforts to validate and implement these tools to further enhance physician practice.