Background Gene set scoring provides a useful approach for quantifying concordance between sample transcriptomes and selected molecular signatures. Most methods use information from all samples to score an individual sample, leading to unstable scores in small data sets and introducing biases from sample composition (e.g. varying numbers of samples for different cancer subtypes). To address these issues, we have developed a truly single-sample scoring method and an associated R/Bioconductor package, singscore (https://bioconductor.org/packages/singscore). Results We use multiple cancer data sets to compare singscore against widely used methods, including GSVA, z-score, PLAGE, and ssGSEA. Our approach does not depend upon background samples, and scores are thus stable regardless of the composition and number of samples being scored. In contrast, scores obtained by GSVA, z-score, PLAGE and ssGSEA can be unstable when fewer samples are available (NS < 25). The singscore method performs as well as the best performing methods in terms of power, recall, false positive rate and computational time, and provides consistently high and balanced performance across all these criteria. To enhance the impact and utility of our method, we have also included a set of functions implementing visual analysis and diagnostics to support the exploration of molecular phenotypes in single samples and across populations of data. Conclusions The singscore method described here functions independently of sample composition in gene expression data and thus provides stable scores, which are particularly useful for small data sets or data integration. Singscore performs well across all performance criteria, and includes a suite of powerful visualization functions to assist in the interpretation of results.
This method performs as well as or better than other scoring approaches in terms of its power to distinguish samples with distinct biology and its ability to call true differential gene sets between two conditions. These scores can be used for dimensional reduction of transcriptomic data and the phenotypic landscapes obtained by scoring samples against multiple molecular signatures may provide insights for sample stratification. Electronic supplementary material The online version of this article (10.1186/s12859-018-2435-4) contains supplementary material, which is available to authorized users.
Key findings
• A novel approach for integrating DNA-seq and single-cell RNA-seq data to reconstruct clonal substructure for single-cell transcriptomes.
• Evidence for non-neutral evolution of clonal populations in human fibroblasts.
• Proliferation and cell cycle pathways are commonly distorted in mutated clonal populations.
Objectives Lipedema, a poorly understood chronic disease of adipose hyper-deposition, is often mistaken for obesity and causes significant impairment to mobility and quality-of-life. To identify molecular mechanisms underpinning lipedema, we employed comprehensive omics-based comparative analyses of whole tissue, adipocyte precursors (adipose-derived stem cells (ADSCs)), and adipocytes from patients with or without lipedema. Methods We compared whole tissue, ADSCs, and adipocytes from body mass index–matched lipedema (n = 14) and unaffected (n = 10) patients using comprehensive global lipidomic and metabolomic analyses, transcriptional profiling, and functional assays. Results Transcriptional profiling revealed >4400 significant differences in lipedema tissue, with altered levels of mRNAs involved in critical signaling and cell function-regulating pathways (e.g., lipid metabolism and cell-cycle/proliferation). Functional assays showed accelerated ADSC proliferation and differentiation in lipedema. Profiling lipedema adipocytes revealed >900 changes in lipid composition and >600 differentially altered metabolites. Transcriptional profiling of lipedema ADSCs and non-lipedema ADSCs revealed significant differential expression of >3400 genes including some involved in extracellular matrix and cell-cycle/proliferation signaling pathways. One upregulated gene in lipedema ADSCs, Bub1, encodes a cell-cycle regulator, central to the kinetochore complex, which regulates several histone proteins involved in cell proliferation. Downstream signaling analysis of lipedema ADSCs demonstrated enhanced activation of histone H2A, a key cell proliferation driver and Bub1 target. Critically, hyperproliferation exhibited by lipedema ADSCs was inhibited by the small molecule Bub1 inhibitor 2OH-BNPP1 and by CRISPR/Cas9-mediated Bub1 gene depletion.
Conclusion We found significant differences in gene expression, and lipid and metabolite profiles, in tissue, ADSCs, and adipocytes from lipedema patients compared to non-affected controls. Functional assays demonstrated that dysregulated Bub1 signaling drives increased proliferation of lipedema ADSCs, suggesting a potential mechanism for enhanced adipogenesis in lipedema. Importantly, our characterization of signaling networks driving lipedema identifies potential molecular targets, including Bub1, for novel lipedema therapeutics.
Genetic maps have been fundamental to building our understanding of disease genetics and evolutionary processes. The gametes of an individual contain all of the information required to perform a de novo chromosome-scale assembly of an individual’s genome, which historically has been performed with populations and pedigrees. Here, we discuss how single-cell gamete sequencing offers the potential to merge the advantages of short-read sequencing with the ability to build personalized genetic maps and open up an entirely new space in personalized genetics.
Advances in RNA sequencing (RNA-seq) technologies that measure the transcriptome of biological samples have revolutionised our ability to understand transcriptional regulatory programs that underpin diseases such as cancer. We recently published singscore, a single-sample, rank-based gene set scoring method which quantifies how concordant the transcriptional profile of an individual sample is with specific gene sets of interest. Here we demonstrate the application of singscore to investigate transcriptional profiles associated with specific mutations or genetic lesions in acute myeloid leukemia. Using matched genomic and transcriptomic data available through the TCGA, we show that scoring of appropriate signatures can distinguish samples with corresponding mutations, reflecting the ability of these mutations to drive aberrant transcriptional programs involved in leukemogenesis. We believe the singscore method is particularly useful for studying heterogeneity within specific subsets of cancers and, as demonstrated here, can identify cases where alternative mutations appear to drive similar transcriptional programs.
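To make the idea of a single-sample, rank-based score concrete, the following is a minimal Python sketch. It is an illustration only, not the singscore package's code (singscore is an R/Bioconductor package). It assumes a simplified scheme: genes are ranked by expression within one sample, the mean rank of the gene set is taken, and that mean is rescaled to [0, 1] using the smallest and largest mean ranks a set of that size could attain. The function name rank_score, the gene names, and the tie handling are hypothetical choices made for this example.

```python
def rank_score(expression, gene_set):
    """Score ONE sample against a gene set using only that sample's ranks.

    expression: dict mapping gene name -> expression value for a single sample.
    gene_set:   iterable of gene names expected to be highly expressed.
    Returns a value in [0, 1]; higher means the set ranks nearer the top.
    This is a simplified illustration, not the singscore implementation.
    """
    # Rank genes in ascending order of expression (rank 1 = lowest).
    # Ties are broken by sort order here, which is a simplification.
    ordered = sorted(expression, key=lambda g: expression[g])
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}

    # Keep only gene set members that were measured in this sample.
    members = [g for g in set(gene_set) if g in rank]
    if not members:
        raise ValueError("no gene set members found in the sample")

    n_total = len(rank)
    n_set = len(members)
    mean_rank = sum(rank[g] for g in members) / n_set

    # Smallest and largest possible mean ranks for a set of this size.
    lowest = (n_set + 1) / 2
    highest = (2 * n_total - n_set + 1) / 2
    if highest == lowest:
        return 0.5  # the gene set covers every gene; the score is uninformative

    return (mean_rank - lowest) / (highest - lowest)


# Hypothetical toy sample with five genes.
sample = {"A": 1.0, "B": 5.0, "C": 3.0, "D": 9.0, "E": 0.5}
top_score = rank_score(sample, ["B", "D"])     # the two most expressed genes
bottom_score = rank_score(sample, ["E", "A"])  # the two least expressed genes
middle_score = rank_score(sample, ["C"])       # the median gene
```

Because the function reads only one sample's expression values, adding or removing other samples cannot change its output, which is the property the abstracts above describe as stability with respect to sample composition.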
We present a case of an obese 22-year-old man with an activating GCK variant who had neonatal hypoglycemia, re-emerging with hypoglycemia later in life. We investigated him for asymptomatic hypoglycemia with a family history of hypoglycemia. Genetic testing yielded a novel GCK missense class 3 variant that was subsequently found in his mother, sister and nephew and reclassified as a class 4 likely pathogenic variant. Glucokinase enables phosphorylation of glucose, the rate-limiting step of glycolysis in the liver and pancreatic β cells. It plays a crucial role in the regulation of insulin secretion. Inactivating variants in GCK cause hyperglycemia and activating variants cause hypoglycemia. Spleen-preserving distal pancreatectomy revealed diffuse hyperplastic islets, nuclear pleomorphism and periductular islets. Glucose-stimulated insulin secretion testing revealed increased insulin secretion in response to glucose. Measurement of cytoplasmic calcium, which triggers exocytosis of insulin-containing granules, revealed a normal basal level but an increased glucose-stimulated level. Unbiased gene expression analysis using 10X single-cell sequencing revealed upregulated INS and CKB genes and downregulated DLK1 and NPY genes in β cells. Further studies are required to see if alteration in expression of these genes plays a role in the metabolic and histological phenotype associated with the pathogenic glucokinase variant. There were more large islets in the patient’s pancreas than in control subjects but there was no difference in the proportion of β cells in the islets. His hypoglycemia was persistent after pancreatectomy, was refractory to diazoxide and improved with pasireotide. This case highlights the variable phenotype of GCK mutations. In-depth molecular analyses in the islets have revealed possible mechanisms for hyperplastic islets and insulin hypersecretion.
Profiling gametes of an individual enables the construction of personalised haplotypes and meiotic crossover landscapes, now achievable at larger scale than ever through the availability of high-throughput single-cell sequencing technologies. However, haplotyping single gametes from high-throughput single-cell DNA sequencing datasets using existing methods requires intensive processing. Here we introduce an efficient software toolset using modern programming languages for the common tasks of haplotyping haploid gamete genomes and calling crossovers (sgcocaller), and constructing and visualising individualised crossover landscapes (comapr) from single gametes. With additional data pre-processing, the tools can also be applied to bulk sequenced samples.