Motivation: Survival analysis methods that integrate pathways/gene sets into their learning model could identify molecular mechanisms that determine survival characteristics of patients. Rather than first picking the predictive pathways/gene sets from a given collection and then training a predictive model on the subset of genomic features mapped to these selected pathways/gene sets, we developed a novel machine learning algorithm (Path2Surv) that performs these two steps jointly using multiple kernel learning. Results: We extensively tested our Path2Surv algorithm on 7655 patients from 20 cancer types using cancer-specific pathway/gene set collections and gene expression profiles of these patients. Path2Surv statistically significantly outperformed survival random forest (RF) on 12 out of 20 datasets and obtained comparable predictive performance against survival support vector machine (SVM) using significantly fewer gene expression features (i.e. less than 10% of what survival RF and survival SVM used). Availability and implementation: Our implementations of survival SVM and Path2Surv algorithms in R are available at https://github.com/mehmetgonen/path2surv together with the scripts that replicate the reported experiments. Supplementary information: Supplementary data are available at Bioinformatics online.
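The idea of selecting pathways and training the model in one step can be illustrated with a small sketch of the kernel-combination part of multiple kernel learning. This is not the authors' R implementation: the function names, the Gaussian kernel, the toy expression matrix and the fixed weights below are all assumptions made for illustration. In an actual multiple kernel learning method the weights would be learned together with the survival model; here they are simply given.

```python
import math


def gaussian_kernel(X, features, sigma=1.0):
    """Gaussian kernel between all pairs of patients, using only `features`."""
    n = len(X)
    return [
        [
            math.exp(
                -sum((X[i][f] - X[j][f]) ** 2 for f in features)
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def combine_kernels(kernels, weights):
    """Weighted sum of pathway-specific kernel matrices."""
    names = list(kernels)
    n = len(kernels[names[0]])
    return [
        [sum(weights[p] * kernels[p][i][j] for p in names) for j in range(n)]
        for i in range(n)
    ]


def active_features(pathways, weights, tol=1e-8):
    """Gene indices that belong to at least one pathway with non-zero weight."""
    return sorted(
        {f for p, genes in pathways.items() if weights[p] > tol for f in genes}
    )


# Hypothetical toy data: 3 patients x 4 genes, two made-up pathways.
X = [
    [0.1, 1.2, 3.0, 0.5],
    [0.3, 1.0, 2.5, 0.7],
    [2.0, 0.1, 0.2, 1.9],
]
pathways = {"pathway_A": [0, 1], "pathway_B": [2, 3]}

# Assumed weights (not learned here); a zero weight drops the whole pathway.
weights = {"pathway_A": 1.0, "pathway_B": 0.0}

kernels = {name: gaussian_kernel(X, genes) for name, genes in pathways.items()}
K = combine_kernels(kernels, weights)
used = active_features(pathways, weights)
```

Because a pathway with zero weight contributes nothing to the combined kernel, a sparse weight vector means the final model depends only on the genes of the retained pathways, which is one way to read the "fewer gene expression features" result reported above.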
Evolutionary conservation is a fundamental resource for predicting the substitutability of amino acids and loss of function in proteins. Using a multiple sequence alignment alone, without considering the evolutionary relationships among sequences, counts evolutionarily related alteration events redundantly, as if they were independent. Here we propose a new method, PHACT, which predicts the pathogenicity of missense mutations directly from the phylogenetic tree of proteins. PHACT travels through the nodes of the phylogenetic tree and evaluates the deleteriousness of a substitution based on the probability differences of ancestral amino acids between neighboring nodes in the tree. Moreover, PHACT assigns a weight to each node in the tree based on its distance to the query organism. For each potential amino acid substitution, the algorithm generates a score that is used to calculate the effect of the substitution on protein function. To analyze the predictive performance of PHACT, we performed various experiments over subsets of two datasets that include 3023 proteins and 61662 variants in total. The experiments demonstrated that our method outperformed the widely used pathogenicity prediction tools (i.e., SIFT and PolyPhen-2) and achieved better predictive performance than other conventional statistical approaches presented in dbNSFP. The PHACT source code is available at https://github.com/CompGenomeLab/PHACT.
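The tree traversal described above can be sketched in a few lines. This is a simplified illustration and not the published PHACT scoring function: the weight `1 / (1 + distance)`, the rule of counting only increases in probability, and the toy tree are all assumptions chosen to show how probability differences between neighboring nodes and distance-based weights could combine into one score per amino acid.

```python
def tolerance_score(edges, probs, dist_to_query, amino_acid):
    """Sum distance-weighted probability gains of `amino_acid` along tree edges.

    edges          -- list of (parent, child) node names
    probs          -- node name -> {amino acid: probability at this position}
    dist_to_query  -- node name -> branch-length distance to the query species
    """
    score = 0.0
    for parent, child in edges:
        gain = probs[child].get(amino_acid, 0.0) - probs[parent].get(amino_acid, 0.0)
        if gain > 0:
            # Assumed weighting: nodes closer to the query count more.
            score += gain / (1.0 + dist_to_query[child])
    return score


# Hypothetical tree: root -> (n1 -> (sp1, sp2), query).
edges = [("root", "n1"), ("root", "query"), ("n1", "sp1"), ("n1", "sp2")]

# Made-up ancestral/observed amino acid probabilities at one alignment position.
probs = {
    "root": {"A": 0.9, "V": 0.1},
    "n1": {"A": 0.5, "V": 0.5},
    "query": {"A": 1.0},
    "sp1": {"V": 1.0},
    "sp2": {"A": 1.0},
}
dist_to_query = {"root": 0.5, "query": 0.0, "n1": 1.0, "sp1": 1.5, "sp2": 1.5}

score_v = tolerance_score(edges, probs, dist_to_query, "V")
score_w = tolerance_score(edges, probs, dist_to_query, "W")
```

In this toy tree, valine gains probability on two edges and so receives a positive score, while tryptophan never appears and scores zero. Under the assumptions of the sketch, a higher score suggests a substitution that evolution has tolerated, and a score near zero suggests a candidate deleterious change. Each gain is counted once per edge, which reflects the point made above about not counting related events as independent.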
Major Depressive Disorder (MDD) is a commonly observed psychiatric disorder that affects more than 2% of the world population with a rising trend. However, disease-associated pathways and biomarkers are yet to be fully comprehended. In this study, we analyzed previously generated RNA-seq data across seven different brain regions from three distinct studies to identify differentially and co-expressed genes for patients with MDD. Differential gene expression (DGE) analysis revealed that NPAS4 is the only gene downregulated in three different brain regions. Furthermore, co-expressing gene modules responsible for glutamatergic signaling are negatively enriched in these regions. We used the results of both DGE and co-expression analyses to construct a novel MDD-associated pathway. In our model, we propose that disruption in glutamatergic signaling-related pathways might be associated with the downregulation of NPAS4 and many other immediate-early genes (IEGs) that control synaptic plasticity. In addition to DGE analysis, we identified the relative importance of KEGG pathways in discriminating MDD phenotype using a machine learning-based approach. We anticipate that our study will open doors to developing better therapeutic approaches targeting glutamatergic receptors in the treatment of MDD.
The Calcium Sensing Receptor (CaSR) plays a key role in controlling the levels of calcium in the body by interacting with different types of G-protein. This receptor is highly conserved among other G-protein coupled receptors (GPCRs) and has been linked to disorders affecting the balance of calcium in the body, such as hypercalcemia and hypocalcemia. Although there has been progress in understanding the structure and function of CaSR, it is still unclear which specific residues are important for its function and how CaSR differs from other receptors in the same class. In this study, we used phylogeny-based methods to identify functionally equivalent orthologs of CaSR, predict the importance of each residue, and calculate specificity-determining position (SDP) scores to uncover the evolutionary basis of its function. Our results showed that the CaSR subfamily is highly conserved, with higher SDP scores than its closest receptor subfamilies. Residues with high SDP scores are likely to be critical in receptor activation and pathogenicity. We applied gradient-boosting trees with evolutionary metrics as inputs to predict the functional consequences of each substitution and to discriminate between gain- and loss-of-function mutations, which cause hypocalcemia and hypercalcemia, respectively. Our study provides insight into the evolutionary fine-tuning of CaSR, which can help to understand its role in calcium balance and related disorders.