2017
DOI: 10.1007/978-3-319-56970-3_20

The Copy-Number Tree Mixture Deconvolution Problem and Applications to Multi-sample Bulk Sequencing Tumor Data

Abstract: Cancer is an evolutionary process driven by somatic mutation. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the mutational complexity of cancer and the fact that nearly all cancer sequencing is of bulk tissue, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy number aberrations (CNAs) measured in bulk-sequen…

Cited by 19 publications (21 citation statements)
References 30 publications (81 reference statements)
“…This deconvolution is complicated as both the CNAs and the proportion of cells originating from each clone in the mixture are unknown; in general the deconvolution problem is underdetermined with multiple equivalent solutions. In the past few years, over a dozen methods have been developed to solve different simplified versions of this copy-number deconvolution problem [6,9,14–27]. These methods rely on various simplifying assumptions such as only one tumor clone is present in the mixture, WGDs are not present, etc.…”
Section: Introduction (mentioning)
confidence: 99%
“…First, HATCHet solves a simultaneous matrix factorization problem which models allele-specific copy numbers, the dependencies between genomic segments across clones, and the dependencies between clones across samples. In contrast, existing methods do not infer allele-specific copy numbers [20–24], consider each segment independently [6,9,14–19], do not preserve clonal structure across samples [21,22,26,27], or assume all samples comprise the same set of few clones [25]. Second, HATCHet globally clusters RDR and BAF jointly along the genome and across all samples, while existing methods rely on local clustering of neighboring loci.…”
Section: Introduction (mentioning)
confidence: 99%
“…Such novel telomeres can correspond to real telomeres, but in many cases are likely due to missing novel adjacencies in the input data. Second, we can further generalize RCK to simultaneously analyze multiple samples from the same individual, perhaps including phylogenetic [62] or longitudinal constraints [38]. Simultaneous analysis of multiple samples has proved useful in copy number inference [63].…”
Section: Discussion (mentioning)
confidence: 99%
“…In order to assess the statistical significance of subnetworks discovered by cd-CAP in the single-subnetwork mode, we introduce for the first time a model in which likely interdependent events, in particular amplification or deletion of all genes in a single chromosome arm, are considered as a single event. Conventional models of gene amplification either consider each gene amplification independently [31] (this is the model we implicitly assume in our combinatorial optimization formulations, giving a lower bound on the true p-value), or assume each amplification can involve more than one gene (forming a subsequent sequence of genes) but with the added assumption that the original gene structure is not altered and the duplications occur in some orthogonal "dimension" [32,33,34]. Both models have their assumptions that do not hold in reality but are motivated by computational constraints: inferring evolutionary history of a genome with arbitrary duplications (that convert one string to another, longer string, by copying arbitrary substrings to arbitrary destinations) is an NP-hard problem (and is difficult to solve even approximately) [35,36].…”
Section: Our Contributions (mentioning)
confidence: 99%
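
The underdetermination noted in the first citation statement (unknown clone copy numbers and unknown clone proportions admitting multiple equivalent solutions) can be illustrated with a toy example. This is a minimal sketch, assuming a simple mixture model in which the fractional copy number observed in a bulk sample is the proportion-weighted average of the clones' integer copy numbers. The function name `mix` and all numbers are hypothetical and are not taken from the paper or from any of the citing works.

```python
def mix(proportions, copy_numbers):
    """Return observed fractional copy numbers, one row per sample.

    proportions:  samples x clones matrix, each row sums to 1 (assumed model)
    copy_numbers: clones x segments matrix of integer copy numbers
    """
    n_segments = len(copy_numbers[0])
    return [
        [
            sum(u * clone[s] for u, clone in zip(sample, copy_numbers))
            for s in range(n_segments)
        ]
        for sample in proportions
    ]


# Hypothetical solution A: 50% normal clone, 50% tumor clone with 4 copies of segment 1.
U_a = [[0.5, 0.5]]
C_a = [[2, 2],   # normal clone
       [4, 2]]   # tumor clone

# Hypothetical solution B: 75% normal clone, 25% tumor clone with 6 copies of segment 1.
U_b = [[0.75, 0.25]]
C_b = [[2, 2],   # normal clone
       [6, 2]]   # tumor clone

# Both factorizations produce the same bulk measurement.
print(mix(U_a, C_a))  # [[3.0, 2.0]]
print(mix(U_b, C_b))  # [[3.0, 2.0]]
```

With a single sample, the two explanations cannot be told apart from the mixed signal alone, which is one reason the quoted methods add simplifying assumptions or, as in the multi-sample setting, use several samples jointly.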