Abstract: Recent cancer sequencing studies provide a wealth of somatic mutation data from a large number of patients. One of the most intriguing and challenging questions arising from this data is to determine whether the temporal order of somatic mutations in a cancer follows any common progression. Since we usually obtain only one sample from a patient, such inferences are commonly made from cross-sectional data from different patients. This analysis is complicated by the extensive variation in the somatic mutations a…
“…Vandin et al [20] introduced Pathway Linear Progression Model (PLPM) which was defined for an integer $K > 1$ as an integer linear program problem of looking for $P^{*} = \arg\min_{P \in \mathcal{P}_K(G)} f(A, P)$, and showed that the problem is an NP-hard problem. To solve it more efficiently, we construct a weighted gene network based on exclusive degree between each pair of genes to simplify the relationships between the genes and to significantly reduce the computational complexity.…”
Section: Constructing a Gene Network Based On Approximate Exclusivity
mentioning
confidence: 99%
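The idea of a weighted gene network built from pairwise exclusivity can be illustrated with a minimal sketch. The citing paper's exact definition of "exclusive degree" is not given in the excerpt above, so the weight used here is an assumption chosen for illustration only: among patients carrying a mutation in at least one of two genes, the fraction carrying a mutation in exactly one of them. The function name `exclusivity_network`, the `min_weight` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def exclusivity_network(
    mutations: Dict[str, Set[str]], min_weight: float = 0.0
) -> Dict[Tuple[str, str], float]:
    """Build a weighted gene-gene network from cross-sectional mutation data.

    `mutations` maps a patient id to the set of genes mutated in that patient.
    The edge weight is an ASSUMED stand-in for "exclusive degree":
    (# patients with exactly one of the two genes mutated)
    / (# patients with at least one of them mutated).
    """
    genes = sorted(set().union(*mutations.values()))
    # For each gene, the set of patients that carry a mutation in it.
    carriers = {g: {p for p, m in mutations.items() if g in m} for g in genes}

    edges: Dict[Tuple[str, str], float] = {}
    for g, h in combinations(genes, 2):
        either = carriers[g] | carriers[h]
        both = carriers[g] & carriers[h]
        weight = (len(either) - len(both)) / len(either)
        if weight >= min_weight:
            edges[(g, h)] = weight
    return edges


# Made-up toy data: four patients, three placeholder genes.
toy_mutations = {
    "p1": {"A"},
    "p2": {"B"},
    "p3": {"A", "B"},
    "p4": {"C"},
}
network = exclusivity_network(toy_mutations)
```

Only gene pairs are scored, so the number of candidate edges grows quadratically with the number of genes, which is the kind of simplification the excerpt contrasts with solving the full integer linear program.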
“…Second, how to detect driver pathways, which are frequently perturbed with a large number of tumor cells, and give rise to the product of tumorigenic properties, such as cell angiogenesis, proliferation or metastasis [1], [2], [3], [12], [13], [14], [15], [16], [17]. Third, how to determine temporal orders of the driver mutations in cancer patients [18], [19], [20], [21], [22]. The first question can usually be solved by comparing mutation frequencies across different individuals [6], [7], [8], [9], [10], [11].…”
Section: Introduction
mentioning
confidence: 99%
“…However, it is almost impossible to obtain samples at multiple time-points from a single individual, therefore, it is difficult to answer the question about temporal progression and identify what mutations occur early in cancer progression [18], [19], [20], [21], [22]. One systematic approach to address the task is to identify mutually exclusive gene sets in cancer genomic data [1], [2], [3], [12], [13], [14], [15], [16], [23].…”
Section: Introduction
mentioning
confidence: 99%
“…Several methods have been introduced to infer temporal progression of gene mutations from cross-sectional data [18], [19], [20], [25], [26], [27], [28]. Desper et al [25], [26] proposed a tree model inference algorithm based on the thought of maximum-weight which relates cancer progression to measurement on gains and losses of chromosomal regions in tumor cells.…”
Section: Introduction
mentioning
confidence: 99%
“…The problem with these approaches is that cancers usually exhibit [2], [3], [12], [13], [14], [15], [16], [23] have indicated that driver mutations in the same pathway tend to be mutually exclusive, that is, most patients have no more than one mutation within the same pathway. Therefore, Vandin et al [20] introduced the exclusivity among mutations (genes) within the same pathway to infer cancer pathways and tumor progression from cross-sectional mutation data. They formulated the Pathway Linear Progression problem as an integer linear program.…”
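The exclusivity property described in this excerpt, that most patients have no more than one mutation within the same pathway, can be checked directly on a mutation table. The following is a minimal sketch under stated assumptions: the input format (patient id mapped to a set of mutated genes), the function name `exclusivity_summary`, and the toy data are hypothetical and are not taken from the cited work.

```python
from typing import Dict, Iterable, Set


def exclusivity_summary(
    mutations: Dict[str, Set[str]], pathway: Iterable[str]
) -> Dict[str, float]:
    """Summarise how exclusively a candidate gene set is mutated.

    Returns the fraction of patients with at least one mutation in the set
    ("covered"), with exactly one ("exactly_one"), and with no more than
    one ("at_most_one").
    """
    if not mutations:
        raise ValueError("mutations must contain at least one patient")
    genes = set(pathway)
    # Number of pathway genes mutated in each patient.
    hits = [len(mutated & genes) for mutated in mutations.values()]
    n = len(hits)
    return {
        "covered": sum(h >= 1 for h in hits) / n,
        "exactly_one": sum(h == 1 for h in hits) / n,
        "at_most_one": sum(h <= 1 for h in hits) / n,
    }


# Made-up toy data: four patients, placeholder gene names.
toy_mutations = {
    "p1": {"G1"},
    "p2": {"G2"},
    "p3": {"G1", "G2"},
    "p4": {"G3"},
}
summary = exclusivity_summary(toy_mutations, ["G1", "G2"])
```

A gene set behaving like a pathway in the sense of the excerpt would show a high "covered" fraction together with an "at_most_one" fraction close to 1.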
Abstract: Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several samples with a temporal order from one patient in evaluating the cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a Network-based method (NetInf) to Infer cancer progression at the pathway level from cross-sectional data across many patients, leveraging on the exclusive property of driver mutations within a pathway and the property of linear progression between pathways. To assess the robustness of NetInf, we apply it on simulated data with the addition of different levels of noise. To verify the performance of NetInf, we apply it to analyze somatic mutation data from three real cancer studies with large number of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without assigning the number of pathways, which also provides new insights on the temporal order of somatic mutations at the pathway level rather than at the gene level.
High‐throughput DNA sequencing techniques enable large‐scale measurement of somatic mutations in tumors. Cancer genomics research aims at identifying all cancer‐related genes and solid interpretation of their contribution to cancer initiation and development. However, this venture is characterized by various challenges, such as the high number of neutral passenger mutations and the complexity of the biological networks affected by driver mutations. Based on biological pathway and network information, sophisticated computational methods have been developed to facilitate the detection of cancer driver mutations and pathways. They can be categorized into (1) methods using known pathways from public databases, (2) network‐based methods, and (3) methods learning cancer pathways de novo. Methods in the first two categories use and integrate different types of data, such as biological pathways, protein interaction networks, and gene expression measurements. The third category consists of de novo methods that detect combinatorial patterns of somatic mutations across tumor samples, such as mutual exclusivity and co‐occurrence. In this review, we discuss recent advances, current limitations, and future challenges of these approaches for detecting cancer genes and pathways. We also discuss the most important current resources of cancer‐related genes. WIREs Syst Biol Med 2017, 9:e1364. doi: 10.1002/wsbm.1364