Background: Gene set scoring provides a useful approach for quantifying concordance between sample transcriptomes and selected molecular signatures. Most methods use information from all samples to score an individual sample, leading to unstable scores in small data sets and introducing biases from sample composition (e.g. varying numbers of samples for different cancer subtypes). To address these issues, we have developed a truly single sample scoring method, and associated R/Bioconductor package singscore (https://bioconductor.org/packages/singscore). Results: We use multiple cancer data sets to compare singscore against widely-used methods, including GSVA, z-score, PLAGE, and ssGSEA. Our approach does not depend upon background samples and scores are thus stable regardless of the composition and number of samples being scored. In contrast, scores obtained by GSVA, z-score, PLAGE and ssGSEA can be unstable when less data are available (NS < 25). The singscore method performs as well as the best performing methods in terms of power, recall, false positive rate and computational time, and provides consistently high and balanced performance across all these criteria. To enhance the impact and utility of our method, we have also included a set of functions implementing visual analysis and diagnostics to support the exploration of molecular phenotypes in single samples and across populations of data. Conclusions: The singscore method described here functions independently of sample composition in gene expression data and thus provides stable scores, which are particularly useful for small data sets or data integration. Singscore performs well across all performance criteria, and includes a suite of powerful visualization functions to assist in the interpretation of results.
This method performs as well as or better than other scoring approaches in terms of its power to distinguish samples with distinct biology and its ability to call true differential gene sets between two conditions. These scores can be used for dimensional reduction of transcriptomic data and the phenotypic landscapes obtained by scoring samples against multiple molecular signatures may provide insights for sample stratification. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2435-4) contains supplementary material, which is available to authorized users.
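The idea of scoring a sample without reference to any other sample can be illustrated with a small rank-based sketch. This is not the singscore package's implementation (the abstract above does not spell out the algorithm); it assumes a score built from the mean within-sample rank of the gene-set members, rescaled by the smallest and largest mean ranks a set of that size could attain, then centered on zero. The function names `rank_genes` and `single_sample_score` are hypothetical.

```python
def rank_genes(sample):
    """Map each gene to its rank within ONE sample (1 = lowest expression).

    Tied expression values receive the average of the ranks they span.
    """
    ordered = sorted(sample, key=sample.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of genes tied with ordered[i].
        while j + 1 < len(ordered) and sample[ordered[j + 1]] == sample[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for gene in ordered[i:j + 1]:
            ranks[gene] = average_rank
        i = j + 1
    return ranks


def single_sample_score(sample, gene_set):
    """Centered, normalized mean rank of gene_set within a single sample.

    Assumed scoring scheme (illustrative only): the result lies in
    [-0.5, 0.5] and uses no information from other samples.
    """
    ranks = rank_genes(sample)
    members = [gene for gene in gene_set if gene in ranks]
    if not members:
        raise ValueError("no gene-set members found in sample")
    n_total, n_set = len(ranks), len(members)
    mean_rank = sum(ranks[gene] for gene in members) / n_set
    # Extreme mean ranks attainable by any set of n_set genes out of n_total.
    min_mean = (n_set + 1) / 2
    max_mean = (2 * n_total - n_set + 1) / 2
    if max_mean == min_mean:  # the set covers every gene in the sample
        return 0.0
    return (mean_rank - min_mean) / (max_mean - min_mean) - 0.5


# Toy sample: gene -> expression value (made-up numbers).
toy_sample = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0, "F": 6.0}
top_score = single_sample_score(toy_sample, ["E", "F"])
bottom_score = single_sample_score(toy_sample, ["A", "B"])
```

Because the ranks are computed inside one sample, adding or removing other samples from a data set cannot change the score, which is the stability property the abstract emphasizes.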
Natural killer (NK) cell activity is essential for initiating antitumor responses and may be linked to immunotherapy success. NK cells and other innate immune components could be exploitable for cancer treatment, which drives the need for tools and methods that identify therapeutic avenues. Here, we extend our gene-set scoring method singscore to investigate NK cell infiltration by applying RNA-seq analysis to samples from bulk tumors. Computational methods have been developed for the deconvolution of immune cell types within solid tumors. We have taken the NK cell gene signatures from several such tools, then curated the gene list using a comparative analysis of tumors and immune cell types. Using a gene-set scoring method to investigate RNA-seq data from The Cancer Genome Atlas (TCGA), we show that patients with metastatic cutaneous melanoma have an improved survival rate if their tumor shows evidence of NK cell infiltration. Furthermore, these survival effects are enhanced in tumors that show higher expression of genes that encode NK cell stimuli such as the cytokine IL15. Using this signature, we then examine transcriptomic data to identify tumor and stromal components that may influence the penetrance of NK cells into solid tumors. Our results provide evidence that NK cells play a role in the regulation of human tumors and highlight potential survival effects associated with increased NK cell activity. Our computational analysis identifies putative gene targets that may be of therapeutic value for boosting NK cell antitumor immunity.
The bond graph approach to modelling biochemical networks is extended to allow hierarchical construction of complex models from simpler components. This is made possible by representing the simpler components as thermodynamically open systems exchanging mass and energy via ports. A key feature of this approach is that the resultant models are robustly thermodynamically compliant: the thermodynamic compliance is not dependent on precise numerical values of parameters. Moreover, the models are reusable due to the well-defined interface provided by the energy ports. To extract bond graph model parameters from parameters found in the literature, general and compact formulae are developed to relate free-energy constants and equilibrium constants. The existence and uniqueness of solutions is considered in terms of fundamental properties of stoichiometric matrices. The approach is illustrated by building a hierarchical bond graph model of glycogenolysis in skeletal muscle.
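The link between per-species free-energy constants and reaction equilibrium constants can be sketched numerically. The abstract does not state the formulae, so the relation used here is an assumption: for mass-action kinetics with species constants K_i and stoichiometric matrix N (rows are species, columns are reactions), each equilibrium constant is taken as ln K_eq,j = -sum_i N[i][j] ln K_i. The function name `equilibrium_constants` and the toy numbers are hypothetical.

```python
import math


def equilibrium_constants(stoichiometry, species_constants):
    """Equilibrium constant of each reaction from per-species constants.

    Assumed relation (not quoted from the source):
        ln K_eq[j] = -sum_i stoichiometry[i][j] * ln species_constants[i]
    stoichiometry[i][j] is the net coefficient of species i in reaction j
    (negative for substrates, positive for products).
    """
    n_species = len(species_constants)
    n_reactions = len(stoichiometry[0])
    result = []
    for j in range(n_reactions):
        log_keq = -sum(
            stoichiometry[i][j] * math.log(species_constants[i])
            for i in range(n_species)
        )
        result.append(math.exp(log_keq))
    return result


# Toy chain A <-> B <-> C with made-up species constants.
N = [
    [-1, 0],   # A is consumed by reaction 1
    [1, -1],   # B is produced by reaction 1 and consumed by reaction 2
    [0, 1],    # C is produced by reaction 2
]
K = [4.0, 2.0, 1.0]
keq = equilibrium_constants(N, K)
```

Under this assumed relation, every equilibrium constant is derived from the same set of species constants, so any choice of positive K_i gives mutually consistent equilibrium constants. Going the other way (recovering K_i from literature K_eq values) means solving a linear system in ln K_i, whose solvability depends on the properties of N.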
MicroRNAs (miRNAs) are important post-transcriptional regulators of gene expression, functioning in part by facilitating the degradation of target mRNAs. They have an established role in controlling epithelial-mesenchymal transition (EMT), a reversible phenotypic program underlying normal and pathological processes. Many studies demonstrate the role of individual miRNAs using overexpression at levels greatly exceeding physiological abundance. This can influence transcripts with relatively poor targeting and may in part explain why over 130 different miRNAs are directly implicated as EMT regulators. Analyzing a human mammary cell model of EMT, we found evidence that a set of miRNAs, including the miR-200 and miR-182/183 family members, co-operate in post-transcriptional regulation, both reinforcing and buffering transcriptional output. Investigating this, we demonstrate that combinatorial treatment altered cellular phenotype with miRNA concentrations much closer to endogenous levels and with fewer off-target effects. This suggests that co-operative targeting by miRNAs is important for their physiological function and that future work classifying miRNAs should consider such combinatorial effects.
Most cancer deaths are due to metastasis, and epithelial-to-mesenchymal transition (EMT) plays a central role in driving cancer cell metastasis. EMT is induced by different stimuli, leading to different signaling patterns and therapeutic responses. TGFβ is one of the best-studied drivers of EMT, and many drugs are available to target this signaling pathway. A comprehensive bioinformatics approach was employed to derive a signature for TGFβ-induced EMT which can be used to score TGFβ-driven EMT in cells and clinical specimens. Considering this signature in pan-cancer cell and tumor datasets, a number of cell lines (including basal B breast cancer and cancers of the central nervous system) show evidence for TGFβ-driven EMT and carry a low mutational burden across the TGFβ signaling pathway. Furthermore, significant variation is observed in the response of high scoring cell lines to some common cancer drugs. Finally, this signature was applied to pan-cancer data from The Cancer Genome Atlas to identify tumor types with evidence of TGFβ-induced EMT. Tumor types with high scores showed significantly lower survival rates than those with low scores and also carry a lower mutational burden in the TGFβ pathway. The current transcriptomic signature demonstrates reproducible results across independent cell line and cancer datasets and identifies samples with strong mesenchymal phenotypes likely to be driven by TGFβ. The TGFβ-induced EMT signature may be useful to identify patients with mesenchymal-like tumors who could benefit from targeted therapeutics to inhibit promesenchymal TGFβ signaling and disrupt the metastatic cascade.
Background: Elucidation of regulatory networks, including identification of regulatory mechanisms specific to a given biological context, is a key aim in systems biology. This has motivated the move from co-expression to differential co-expression analysis and numerous methods have been developed subsequently to address this task; however, evaluation of methods and interpretation of the resulting networks has been hindered by the lack of known context-specific regulatory interactions. Results: In this study, we develop a simulator based on dynamical systems modelling capable of simulating differential co-expression patterns. With the simulator and an evaluation framework, we benchmark and characterise the performance of inference methods. Defining three different levels of “true” networks for each simulation, we show that accurate inference of causation is difficult for all methods, compared to inference of associations. We show that a z-score-based method has the best general performance. Further, analysis of simulation parameters reveals five network and simulation properties that explain the performance of methods. The evaluation framework and inference methods used in this study are available in the dcanr R/Bioconductor package. Conclusions: Our analysis of networks inferred from simulated data shows that hub nodes are more likely to be differentially regulated targets than transcription factors. Based on this observation, we propose an interpretation of the inferred differential network that can reconstruct a putative causal network.
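A z-score approach to differential co-expression can be sketched for a single gene pair as follows. This is an illustration under stated assumptions, not the dcanr implementation: it assumes Pearson correlation within each condition, the Fisher transformation of each correlation, and a standardized difference between the two conditions. The function names `pearson` and `differential_coexpression_z` and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def differential_coexpression_z(x1, y1, x2, y2):
    """z-score for the change in correlation of one gene pair.

    (x1, y1): expression of the two genes in condition 1;
    (x2, y2): the same two genes in condition 2.
    Needs at least 4 samples per condition.
    """
    n1, n2 = len(x1), len(x2)
    if n1 < 4 or n2 < 4:
        raise ValueError("at least 4 samples per condition are required")
    limit = 1 - 1e-12  # keep atanh finite when a correlation is exactly +/-1
    r1 = max(-limit, min(limit, pearson(x1, y1)))
    r2 = max(-limit, min(limit, pearson(x2, y2)))
    # Fisher transformation makes the correlations approximately normal.
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / standard_error


# Toy gene pair: positively correlated in condition 1, negatively in condition 2.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b_cond1 = [1.0, 3.0, 2.0, 5.0, 4.0]
gene_b_cond2 = [5.0, 3.0, 4.0, 1.0, 2.0]
z = differential_coexpression_z(gene_a, gene_b_cond1, gene_a, gene_b_cond2)
```

A large absolute z suggests that the association between the two genes differs between conditions; applying the test to every gene pair and thresholding the scores yields a differential association network, which, as the abstract notes, is easier to infer than causal structure.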