We introduce Pathifier, an algorithm that infers pathway deregulation scores for each tumor sample on the basis of expression data. This score is determined, in a context-specific manner, for every particular dataset and type of cancer that is being investigated. The algorithm transforms gene-level information into pathway-level information, generating a compact and biologically relevant representation of each sample. We demonstrate the algorithm's performance on three colorectal cancer datasets and two glioblastoma multiforme datasets and show that our multipathway-based representation is reproducible, preserves much of the original information, and allows inference of complex biologically significant information. We discovered several pathways that were significantly associated with survival of glioblastoma patients and two whose scores are predictive of survival in colorectal cancer: CXCR3-mediated signaling and oxidative phosphorylation. We also identified a subclass of proneural and neural glioblastoma with significantly better survival, and an EGF receptor-deregulated subclass of colon cancers.

computational biology | systems biology | oncogenomics | principal curve

The operation of many important pathways is altered during cancer initiation and progression. Identifying the involved pathways and quantifying their deregulation is a very important step toward understanding the malignancy process (1-5). Because advanced therapies target specific pathways, pathway-level understanding is a key step also for developing personalized cancer treatments. Indeed, many methods, such as those described in refs. 5-10, were developed for pathway analysis of high-throughput data. Nearly all methods characterize a pathway's activity for an entire sample set and do not provide information on its deregulation in a particular tumor.
One prominent exception is Pathway Recognition Algorithm using Data Integration on Genomic Models (PARADIGM) (11), a tool that deduces for each pathway and sample a score using the pathway's known connectivity and functional structure. Hence, it may not work well for many complex pathways that play significant roles in cancer, for which either the mechanism of pathway activity is not well known or essential relevant data (such as protein abundance and phosphorylation status) are unavailable.

We introduce a method to calculate, independently for every pathway, a score that represents the extent to which the pathway is deregulated in every individual sample. We quantify the level of deregulation of a pathway in a sample by measuring the deviation of the sample from normal behavior. We do not need detailed reliable knowledge of the network or wiring diagram that underlies the pathway's activity. Hence, our estimates of pathway deregulation in a given sample are not restricted to only simple pathways. The method is knowledge-based, because we use generally well-known external information on the identity of the genes that belong to each pathway. Since the detailed interactions in each pathway are largel...