The increasing availability of large-scale single-cell atlases has enabled the detailed description of cell states. In parallel, advances in deep learning allow rapid analysis of newly generated query datasets by mapping them into reference atlases. However, existing data transformations learned to map query data are not easily explainable using biologically known concepts such as genes or pathways. Here we propose expiMap, a biologically informed deep-learning architecture that enables single-cell reference mapping. ExpiMap learns to map cells into biologically understandable components representing known ‘gene programs’. The activity of each gene program in each cell is learned while the known programs are simultaneously refined and de novo programs are learned. We show that expiMap compares favourably to existing methods while bringing an additional layer of interpretability to integrative single-cell analysis. Furthermore, we demonstrate its applicability to analyse single-cell perturbation responses in different tissues and species and resolve responses of patients who have coronavirus disease 2019 to different treatments across cell types.
Background: Single-cell metabolic studies bring new insights into cellular function, which often cannot be captured on other omics layers. Metabolic information has wide applicability, such as for the study of cellular heterogeneity or for the understanding of drug mechanisms and biomarker development. However, metabolic measurements at the single-cell level are limited by insufficient scalability and sensitivity, as well as resource intensiveness, and are currently not possible in parallel with measuring transcript state, commonly used to identify cell types. Nevertheless, because omics layers are strongly intertwined, it is possible to make metabolic predictions based on measured data of more easily measurable omics layers together with prior metabolic network knowledge. Scope of Review: We summarize the current state of single-cell metabolic measurement and modeling approaches, motivating the use of computational techniques. We review three main classes of computational methods used for prediction of single-cell metabolism: pathway-level analysis, constraint-based modeling, and kinetic modeling. We describe the unique challenges arising when transitioning from bulk to single-cell modeling. Finally, we propose potential model extensions and computational methods that could be leveraged to achieve these goals. Major Conclusions: Single-cell metabolic modeling is a rising field that provides a new perspective for understanding cellular functions. The presented modeling approaches vary in terms of input requirements and assumptions, scalability, modeled metabolic layers, and newly gained insights. We believe that the use of prior metabolic knowledge will lead to more robust predictions and will pave the way for mechanistic and interpretable machine-learning models.
Dictyostelium development begins with single-cell starvation and ends with multicellular fruiting bodies. Developmental morphogenesis is accompanied by sweeping transcriptional changes, encompassing nearly half of the 13,000 genes in the genome. We performed time-series RNA-sequencing analyses of the wild type and 20 mutants to explore the relationships between transcription and morphogenesis. These strains show developmental arrest at different stages, accelerated development, or atypical morphologies. Considering eight major morphological transitions, we identified 1371 milestone genes whose expression changes sharply between consecutive transitions. We also identified 1099 genes as members of 21 regulons, which are groups of genes that remain coordinately regulated despite the genetic, temporal, and developmental perturbations. The gene annotations in these groups validate known transitions and reveal new developmental events. For example, DNA replication genes are tightly coregulated with cell division genes, so they are expressed in mid-development although chromosomal DNA is not replicated. Our data set includes 486 transcriptional profiles that can help identify new relationships between transcription and development and improve gene annotations. We demonstrate its utility by showing that cycles of aggregation and disaggregation in allorecognition-defective mutants involve dedifferentiation. We also show sensitivity to genetic and developmental conditions in two commonly used actin genes, act6 and act15, and robustness of the coaA gene. Finally, we propose that gpdA is a better mRNA quantitation standard because it is less sensitive to external conditions than commonly used standards. The data set is available for democratized exploration through the web application dictyExpress and the data mining environment Orange.
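As a rough illustration of what "expression changes sharply between consecutive transitions" could mean computationally, the sketch below flags genes whose log2 fold change between adjacent stages exceeds a threshold. This is not the authors' procedure: the function name `milestone_genes`, the threshold, the pseudocount, and the toy values are all assumptions made for illustration.

```python
import math


def milestone_genes(expression, min_abs_log2_fc=2.0, pseudocount=1.0):
    """Flag genes with a sharp change between consecutive stages.

    expression: dict mapping gene name -> list of mean expression values,
                one per developmental stage, in temporal order.
    Returns a dict mapping gene name -> list of (stage_index, log2_fc),
    where stage_index i refers to the step from stage i to stage i + 1.
    The threshold and pseudocount are illustrative assumptions.
    """
    hits = {}
    for gene, values in expression.items():
        changes = []
        for i in range(len(values) - 1):
            # The pseudocount avoids division by zero for unexpressed genes.
            fc = math.log2((values[i + 1] + pseudocount) / (values[i] + pseudocount))
            if abs(fc) >= min_abs_log2_fc:
                changes.append((i, fc))
        if changes:
            hits[gene] = changes
    return hits


# Toy data with hypothetical gene names: geneA jumps between stages 1 and 2,
# geneB stays roughly flat across all stages.
toy = {
    "geneA": [1.0, 1.0, 31.0, 31.0],
    "geneB": [10.0, 12.0, 11.0, 10.0],
}
result = milestone_genes(toy)
```

With the toy input, only `geneA` is flagged, at the step from stage 1 to stage 2. A real analysis would also need to handle replicate variability and the comparison across mutant strains, which this sketch ignores.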
Multiple single-cell RNA sequencing (scRNA-seq) datasets have been generated to study pancreatic islet cells during development, homeostasis, and diabetes progression. However, despite the time and resources invested into the past scRNA-seq studies, there is still no consensus on islet cell states and associated pathways in health and dysfunction as well as the value of frequently used preclinical mouse diabetes models. Since these challenges can only be resolved with a joint analysis of multiple datasets, we present an scRNA-seq cross-condition mouse islet atlas (MIA). We integrated over 300,000 cells from nine datasets with 56 samples, varying in age, sex, and diabetes models, including autoimmune type 1 diabetes (T1D) model (NOD), gluco-/lipotoxicity T2D model (db/db), and chemical streptozotocin (STZ) β-cell ablation model. MIA is a curated resource that enables interactive exploration of gene expression and transfer of cell types and states. We use MIA to obtain new insights into islet cells in health and disease that cannot be reached from individual datasets. Based on the MIA β-cell landscape we report cross-publication differences between previously suggested marker genes of individual phenotypes. We further show that in the STZ model β-cells transcriptionally correlate to human T2D and mouse db/db model β-cells, but are less similar to human T1D and mouse NOD model β-cells. We define new cell states involved in disease progression across diabetes models. We also observe different pathways shared between immature, aged, and diabetes model β-cell states. In conclusion, our work presents the first comprehensive analysis of β-cell responses to different stressors, providing a roadmap for the understanding of β-cell plasticity, compensation, and demise.
The miRNA regulome is the whole set of regulatory elements that regulate miRNA expression or are under the control of miRNAs. Understanding it is vital for comprehending miRNA functions. Classification of miRNA-related genetic variability is challenging because miRNAs interact with different genomic elements and are studied at different omics levels. In the present study, miRNA-associated genetic variability is presented at three levels: miRNA genes and their upstream regulation, miRNA silencing machinery, and miRNA targets. Several types of miRNA-associated genetic variations are known, including short and structural polymorphisms and epimutations. Differential expression can also affect miRNA regulome function. Classification of miRNA-associated genetic variability presents a baseline for complementing sequence variant nomenclature, planning of experiments, protocols for multi-omics data integration, and development of biomarkers.
The increasing availability of large-scale single-cell datasets has enabled the detailed description of cell states across multiple biological conditions and perturbations. In parallel, recent advances in unsupervised machine learning, particularly in transfer learning, have enabled fast and scalable mapping of these new single-cell datasets onto reference atlases. The resulting large-scale machine learning models, however, often have millions of parameters, rendering interpretation of the newly mapped datasets challenging. Here, we propose expiMap, a deep learning model that enables interpretable reference mapping using biologically understandable entities, such as curated sets of genes and gene programs. The key concept is the substitution of the uninterpretable nodes in an autoencoder's bottleneck by labeled nodes mapping to interpretable lists of genes, such as gene ontologies, biological pathways, or curated gene sets, for which activities are learned as constraints during reconstruction. This is enabled by the incorporation of predefined gene programs into the reference model, while at the same time allowing the model to learn new programs de novo and refine existing programs during reference mapping. We show that the model achieves integration performance similar to that of existing methods while providing a biologically interpretable framework for understanding cellular behavior. We demonstrate the capabilities of expiMap by applying it to 15 datasets encompassing five different tissues and species. The interpretable nature of the mapping revealed unreported associations between interferon signaling via the RIG-I/MDA5 and GPCRs pathways, with differential behavior in CD8+ T cells and CD14+ monocytes in severe COVID-19, as well as the role of annexins in the cellular communications between lymphoid and myeloid compartments for explaining patient response to the applied drugs.
Finally, expiMap enabled the direct comparison of a diverse set of pancreatic beta cells from multiple studies where we observed a strong, previously unreported correlation between the unfolded protein response and asparagine N-linked glycosylation. Altogether, expiMap enables the interpretable mapping of single cell transcriptome data sets across cohorts, disease states and other perturbations.
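The idea of replacing unlabeled bottleneck nodes with nodes tied to gene lists can be sketched as a linear decoder whose weights are masked by gene-program membership, so each latent node can only contribute to the genes in its own program. The sketch below is an assumption-laden toy in plain Python, not the expiMap implementation: the function names, the weights, and the two small gene programs are invented for illustration, and training, the encoder, and de novo program learning are omitted.

```python
def build_program_mask(genes, programs):
    """Return a list of rows (one per program) with 1.0 where the gene
    belongs to the program and 0.0 elsewhere."""
    mask = []
    for members in programs.values():
        member_set = set(members)
        mask.append([1.0 if gene in member_set else 0.0 for gene in genes])
    return mask


def masked_linear_decode(activities, weights, mask):
    """Reconstruct expression from per-cell program activities.

    activities: one row per cell, one value per program.
    weights, mask: one row per program, one value per gene.
    A weight only contributes where the mask is 1.0, so every latent
    node stays attached to its own gene list.
    """
    n_programs = len(mask)
    n_genes = len(mask[0])
    reconstruction = []
    for cell in activities:
        row = [
            sum(cell[p] * weights[p][g] * mask[p][g] for p in range(n_programs))
            for g in range(n_genes)
        ]
        reconstruction.append(row)
    return reconstruction


# Toy setup (illustrative program names and membership, not a curated gene set).
genes = ["IFIT1", "ISG15", "INS", "IAPP", "ACTB"]
programs = {
    "interferon_signaling": ["IFIT1", "ISG15"],
    "beta_cell_identity": ["INS", "IAPP"],
}
mask = build_program_mask(genes, programs)

# Hypothetical decoder weights; in a trained model these would be learned.
weights = [
    [0.5, 1.5, 0.7, 0.2, 0.9],
    [0.3, 0.4, 2.0, 1.0, 0.8],
]

# Two cells: the first has only the interferon program active,
# the second only the beta-cell program.
activities = [[1.0, 0.0], [0.0, 2.0]]
reconstruction = masked_linear_decode(activities, weights, mask)
```

Because the mask zeroes every weight outside a program's gene list, a cell with only the interferon node active reconstructs nonzero values only for `IFIT1` and `ISG15`, which is what makes each node's activity readable as the activity of a named gene program.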
Erstwhile, sex was determined by observation, which is not always feasible. Nowadays, genetic methods are prevailing due to their accuracy, simplicity, low costs, and time‐efficiency. However, there is no comprehensive review enabling overview and development of the field. The studies are heterogeneous, lacking a standardized reporting strategy. Therefore, our aim was to collect genetic sexing assays for mammals and assemble them in a catalogue with unified terminology. Publications were extracted from online databases using key words such as sexing and molecular. The collected data were supplemented with species and gene IDs and the type of sex‐specific sequence variant (SSSV). We developed a catalogue and graphic presentation of diagnostic tests for molecular sex determination of mammals, based on 58 papers published from 2/1991 to 10/2016. The catalogue consists of five categories: species, genes, SSSVs, methods, and references. Based on the analysis of published literature, we propose minimal requirements for reporting, consisting of: species scientific name and ID, genetic sequence with name and ID, SSSV, methodology, genomic coordinates (e.g., restriction sites, SSSVs), amplification system, and description of detected amplicon and controls. The present study summarizes vast knowledge that has up to now been scattered across databases, representing the first step toward standardization regarding molecular sexing, enabling a better overview of existing tests and facilitating planned designs of novel tests. The project is ongoing; collecting additional publications, optimizing field development, and standardizing data presentation are needed.
Schizophrenia (SZ) onset and treatment outcome have important genetic components; however, individual genes do not have strong effects on SZ phenotype. Therefore, it is important to use the pathway-based approach and study metabolic and signaling pathways, such as dopaminergic and serotonergic. The serotonin pathway has an important role in brain signaling; nevertheless, its role in SZ is not as thoroughly examined as that of the dopamine pathway. In this study, we reviewed serotonin pathway genes and genetic variations associated with SZ, including variations at DNA, RNA, and epigenetic level. We obtained 30 serotonin pathway genes from the Kyoto Encyclopedia of Genes and Genomes and used these genes for the literature review. We extracted 20 protein-coding serotonin pathway genes with genetic variations associated with SZ onset, development, and treatment from 31 research papers. Genes associated with SZ are present on all levels of the serotonin pathway: serotonin synthesis, transport, receptor binding, intracellular signaling, and reuptake; however, regulatory genes are poorly researched. We summarized common challenges of genetic association studies and presented some solutions. The analysis of reported serotonin pathway-SZ associations revealed a lack of information about certain serotonin pathway genes potentially associated with SZ. Furthermore, it is becoming clear that interactions among serotonin pathway genes and their regulators may bring further knowledge about their involvement in SZ. KEYWORDS: genetic association studies, genetic variations, pharmacogenetics, schizophrenia, serotonin pathway