Background: Phenotype prediction problems are usually considered ill-posed, as the number of samples is very limited relative to the number of genetic probes scrutinized. This complicates the sampling of defective genetic pathways, because of the high number of possible discriminatory genetic networks involved. In this research, we outline three novel sampling algorithms used to identify, classify and characterize the defective pathways in phenotype prediction problems: the Fisher's ratio sampler, the Holdout sampler and the Random sampler. We apply each one to the analysis of genetic pathways involved in tumor behavior and outcomes of triple negative breast cancers (TNBC). Altered biological pathways are identified using the most frequently sampled genes and are compared to those obtained via Bayesian Networks (BNs). Results: The Random, Fisher's ratio and Holdout samplers were more accurate and robust than BNs, while providing comparable insights about disease genomics. Conclusions: The three samplers tested are good alternatives to Bayesian Networks, since they are less computationally demanding. Importantly, this analysis confirms the concept of "biological invariance", since the altered pathways should be independent of the sampling methodology and the classifier used for their inference. Nevertheless, some modifications to Bayesian networks are still needed for them to sample the uncertainty space correctly in phenotype prediction problems, since the probabilistic parameterization of the uncertainty space is not unique and the use of the optimum network might falsify the pathway analysis.

$$L^{*}(g): g \in \mathbb{R}^{s} \rightarrow C = \{1, 2\} \qquad (1)$$

The simplest case is to divide the phenotype into healthy controls and disease samples, but other problems
Discrimination of case-control status based on gene expression differences has the potential to identify novel pathways relevant to neurodegenerative diseases, including Parkinson’s disease (PD). In this paper we applied two novel algorithms to predict dysregulated pathways of gene expression across several regions of the brain in PD patients and controls. Phenotype prediction problems are, by nature, highly underdetermined. The Fisher’s ratio sampler uses the Fisher’s ratio of the most discriminatory genes as a prior probability distribution to sample the genetic networks, and establishes their likelihood (accuracy) via Leave-One-Out Cross-Validation (LOOCV). The holdout sampler finds the minimum-scale signatures corresponding to different random holdouts, establishing their likelihood using the validation dataset in each holdout. We used both approaches to sample different lists of genes that optimally discriminate PD from controls and subsequently used gene ontology to identify pathways affected by disease. Both algorithms identified common pathways of Insulin signaling, FOXA1 Transcription Factor Network, HIF-1 Signaling, p53 Signaling and Chromatin Regulation/Acetylation. This analysis provides new therapeutic targets to treat PD.
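The Fisher’s ratio sampler described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several pieces are assumptions made for illustration: the Fisher's ratio is taken as the squared difference of class means over the sum of class variances, the classifier is a simple nearest-centroid rule, and the names `fisher_ratio_sampler`, `draw_genes`, `n_networks`, `size` and `min_acc` are hypothetical. Classes are labeled 1 and 2, and each class is assumed to contain at least two samples.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher's ratio of one gene for classes 1 and 2 (assumed definition)."""
    a = [v for v, c in zip(values, labels) if c == 1]
    b = [v for v, c in zip(values, labels) if c == 2]
    # Tiny constant avoids division by zero when both classes are constant.
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)


def predict(train_X, train_y, sample, genes):
    """Nearest-centroid prediction restricted to `genes` (stand-in classifier)."""
    best, best_d = None, float("inf")
    for c in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == c]
        d = sum((sample[g] - mean(r[g] for r in rows)) ** 2 for g in genes)
        if d < best_d:
            best, best_d = c, d
    return best


def loocv_accuracy(X, y, genes):
    """Leave-one-out cross-validation accuracy of a gene subset."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i], genes) == y[i]
    return hits / len(X)


def draw_genes(rng, weights, size):
    """Weighted draw of `size` distinct genes, without replacement."""
    pool = [g for g, w in enumerate(weights) if w > 0]
    if len(pool) < size:
        raise ValueError("not enough genes with a positive weight")
    chosen = []
    for _ in range(size):
        g = rng.choices(pool, weights=[weights[p] for p in pool])[0]
        chosen.append(g)
        pool.remove(g)
    return sorted(chosen)


def fisher_ratio_sampler(X, y, n_networks=200, size=3, min_acc=0.9, seed=0):
    """Count how often each gene appears in accepted small-scale signatures."""
    rng = random.Random(seed)
    # Fisher's ratios act as the (unnormalized) prior sampling weights.
    weights = [fisher_ratio([row[g] for row in X], y) for g in range(len(X[0]))]
    counts = Counter()
    for _ in range(n_networks):
        genes = draw_genes(rng, weights, size)
        # LOOCV accuracy plays the role of the likelihood of the network.
        if loocv_accuracy(X, y, genes) >= min_acc:
            counts.update(genes)
    return counts
```

Here `X` is a list of samples (each a list of expression values) and `y` the matching list of class labels. The returned counter gives the sampling frequency of each gene index, which is the quantity a pathway analysis would then rank. A holdout variant would, under the same assumptions, replace the LOOCV step with accuracy on the validation part of a random train/validation split.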
We present the analysis of defective pathways in multiple myeloma (MM) using two recently developed algorithms for sampling biological pathways: the Fisher’s ratio sampler and the holdout sampler. To provide deeper insight into the molecular mechanisms involved in the disease, we performed retrospective analyses of gene expression datasets concerning different aspects of it: the differences between bone marrow stromal cells in MM and healthy controls (HC), the gene expression profiling of CD34+ cells in MM and HC, the difference between hyperdiploid and non-hyperdiploid myelomas, and the prediction of the chromosome 13 deletion. Our analysis has shown the importance of altered pathways related to glycosylation, infectious disease, immune system response, different aspects of metabolism, DNA repair, protein recycling and regulation of the transcription of genes involved in the differentiation of myeloid cells. The main differences in genetic pathways between hyperdiploid and non-hyperdiploid myelomas are related to infectious disease, immune system response and protein recycling. Our work provides new insights into the genetic pathways involved in this complex disease and proposes novel targets for future therapies.