Background: Methods for the integrative analysis of multi-omics data are required to draw a more complete and accurate picture of the dynamics of molecular systems. The complexity of biological systems, the technological limits, the large number of biological variables and the relatively low number of biological samples make the analysis of multi-omics datasets a non-trivial problem. Results and Conclusions: We review the most advanced strategies for integrating multi-omics datasets, focusing on mathematical and methodological aspects.
PURPOSE Recurrently mutated genes and chromosomal abnormalities have been identified in myelodysplastic syndromes (MDS). We aim to integrate these genomic features into disease classification and prognostication. METHODS We retrospectively enrolled 2,043 patients. Using Bayesian networks and Dirichlet processes, we combined mutations in 47 genes with cytogenetic abnormalities to identify genetic associations and subgroups. Random-effects Cox proportional hazards multistate modeling was used for developing prognostic models. An independent validation on 318 cases was performed. RESULTS We identify eight MDS groups (clusters) according to specific genomic features. In five groups, dominant genomic features include splicing gene mutations (SF3B1, SRSF2, and U2AF1) that occur early in disease history, determine specific phenotypes, and drive disease evolution. These groups display different prognoses (groups with SF3B1 mutations being associated with better survival). Specific co-mutation patterns account for clinical heterogeneity within SF3B1- and SRSF2-related MDS. MDS with complex karyotype and/or TP53 gene abnormalities and MDS with acute leukemia–like mutations show the poorest prognosis. MDS with 5q deletion are clustered into two distinct groups according to the number of mutated genes and/or presence of TP53 mutations. By integrating 63 clinical and genomic variables, we define a novel prognostic model that generates personally tailored predictions of survival. The predicted and observed outcomes correlate well in internal cross-validation and in an independent external cohort. This model substantially improves predictive accuracy of currently available prognostic tools. We have created a Web portal that allows outcome predictions to be generated for user-defined constellations of genomic and clinical features.
CONCLUSION Genomic landscape in MDS reveals distinct subgroups associated with specific clinical features and discrete patterns of evolution, providing a proof of concept for next-generation disease classification and prognosis.
Background: Breast cancer is one of the most common cancer types. Due to the complexity of this disease, it is important to approach its study with an integrated and multilevel perspective, from genes, transcripts and proteins to molecular networks, cell populations and tissues. According to the systems biology perspective, biological functions arise from complex networks: in this context, concepts like molecular pathways, protein-protein interactions (PPIs), mathematical models and ontologies play an important role in dissecting such complexity. Results: In this work we present the Genes-to-Systems Breast Cancer (G2SBC) Database, a resource which integrates data about genes, transcripts and proteins reported in the literature as altered in breast cancer cells. Besides data integration, we provide an ontology-based query system and analysis tools related to intracellular pathways, PPIs, protein structure and systems modelling, in order to facilitate the study of breast cancer from a multilevel perspective. The resource is available at the URL http://www.itb.cnr.it/breastcancer. Conclusions: The G2SBC Database represents a systems biology-oriented data integration approach devoted to breast cancer. By means of the analysis capabilities provided by the web interface, it is possible to overcome the limits of reductionist resources, enabling predictions that can lead to new experiments.
Three haplotype blocks were identified in the CaSR gene. The first block was characterized by six SNPs and included gene promoters. The rs7652589 and rs1501899 SNPs and the CATTCA haplotype of the first block were significantly more frequent in normocitraturic calcium kidney stone formers than in controls. The risk of stones was increased in normocitraturic patients homozygous or heterozygous for the CATTCA haplotype. The rate of stones was higher in stone formers with the CATTCA haplotype. In a three-generation family, calcium stones were associated with the CATTCA haplotype. The bioinformatic analysis identified a new site for the octamer-binding transcription factor 1 in the presence of the variant alleles at the rs7652589 and rs1501899 SNPs. This transcription factor may downregulate the transcription of vitamin D-dependent genes and CaSR expression. Conclusion. SNPs and the CATTCA haplotype of the first block of the CaSR gene are associated with kidney stones in normocitraturic patients.
Minor allele at rs6776158 may predispose to calcium stones by decreasing transcriptional activity of the CaSR gene promoter 1 and CaSR expression in kidney tubules.
A relation exists between network proximity of molecular entities in interaction networks, functional similarity and association with diseases. The identification of network regions associated with biological functions and pathologies is a major goal in systems biology. We describe a network diffusion-based pipeline for the interpretation of different types of omics in the context of molecular interaction networks. We introduce the network smoothing index, a network-based quantity that jointly quantifies the amount of omics information in genes and in their network neighbourhood, using network diffusion to define network proximity. The approach is applicable to both descriptive and inferential statistics calculated on omics data. We also show that network resampling, applied to gene lists ranked by quantities derived from the network smoothing index, indicates the presence of significantly connected genes. As a proof of principle, we identified gene modules enriched in somatic mutations and transcriptional variations observed in samples of prostate adenocarcinoma (PRAD). In line with the local hypothesis, the network smoothing index and network resampling underlined the existence of a connected component of genes harbouring molecular alterations in PRAD.
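The idea of using network diffusion to define proximity can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the pipeline described in the abstract: the toy four-node graph, the damping parameter `alpha`, the symmetric normalization of the adjacency matrix, and in particular the ratio used in the hypothetical `smoothing_index` function, which is only one plausible way to compare a gene's initial score with its score after diffusion.

```python
import numpy as np

def diffuse(adjacency, x0, alpha=0.7):
    """Steady state of x_{t+1} = alpha * W @ x_t + (1 - alpha) * x0.

    W is the symmetrically normalized adjacency matrix D^-1/2 A D^-1/2
    (an assumed choice; other normalizations are possible).
    """
    a = np.asarray(adjacency, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    degree = a.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor instead of a division by zero.
    safe_degree = np.where(degree > 0, degree, 1.0)
    inv_sqrt = np.where(degree > 0, 1.0 / np.sqrt(safe_degree), 0.0)
    w = inv_sqrt[:, None] * a * inv_sqrt[None, :]
    n = a.shape[0]
    # Closed-form fixed point: x* = (1 - alpha) * (I - alpha * W)^-1 x0
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * w, x0)

def smoothing_index(x0, x_star, eps=1.0):
    """Hypothetical index: diffused score relative to the initial score plus eps."""
    return np.asarray(x_star, dtype=float) / (np.asarray(x0, dtype=float) + eps)

# Toy path graph 0 - 1 - 2 - 3 with all initial "omics" signal on node 0.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
x0 = np.array([1.0, 0.0, 0.0, 0.0])

x_star = diffuse(adjacency, x0)
index = smoothing_index(x0, x_star)
print("diffused scores:", x_star)
print("smoothing index:", index)
```

In this toy setting the diffused score decays with distance from the node carrying the initial signal, so a node with no signal of its own can still score well if it sits next to an altered node. The `eps` term controls how strongly a node's own initial signal is weighed against what it receives from its neighbourhood.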
Signal transduction and gene regulation determine a major reorganization of metabolic activities in order to support cell proliferation. Protein Kinase B (PKB), also known as Akt, participates in the PI3K/Akt/mTOR pathway, a master regulator of aerobic glycolysis and cellular biosynthesis, two activities shown by both normal and cancer proliferating cells. Not surprisingly, considering its relevance for cellular metabolism, Akt/PKB is often found hyperactive in cancer cells. In the last decade, many efforts have been made to improve the understanding of the control of glucose metabolism and the identification of a therapeutic window between proliferating cancer cells and proliferating normal cells. In this context, we have modeled the link between the PI3K/Akt/mTOR pathway, glycolysis, lactic acid production, and nucleotide biosynthesis. We used a computational model to compare two metabolic states generated by two different levels of signaling through the PI3K/Akt/mTOR pathway: one of the two states represents the metabolism of a growing cancer cell characterized by aerobic glycolysis and cellular biosynthesis, while the other state represents the same metabolic network with a reduced glycolytic rate and a higher mitochondrial pyruvate metabolism. Biochemical reactions that link glycolysis and the pentose phosphate pathway revealed their importance for controlling the dynamics of cancer glucose metabolism.