2020
DOI: 10.1038/s41588-020-0675-5

Subclonal reconstruction of tumors by using machine learning and population genetics

Abstract: The vast majority of cancer next-generation sequencing data consist of bulk samples composed of mixtures of cancer and normal cells. To study tumor evolution, subclonal reconstruction approaches based on machine learning are used to separate subpopulations of cancer cells and reconstruct their ancestral relationships. However, current approaches are entirely data-driven and agnostic to evolutionary theory. We demonstrate that systematic errors occur in subclonal reconstruction if tumor evolution is not accounte…

Cited by 94 publications (224 citation statements)
References 63 publications (114 reference statements)
“…This is mainly due to the fact that cancer follows distinct evolutionary trajectories in patients compared to their genomic landscapes, not only during the initiation and metastasis cascade of cancer cells but also in response to the treatment in cancer therapies [18,19]. For this reason, the accurate identification of subclonal drivers holds great importance for the timing of the subclonal expansion and its diversity in cancer therapies [20]. This sophisticated subclonal identification tool, empowered by machine learning and population genetics, will potentially lead to developing more comprehensive computational methods by integrating with network-driven approaches for cancer systems biology in the future.…”
Section: Cancer Systems Biology For Precision Medicine (mentioning)
confidence: 99%
“…Here, the frequency distribution of variants queried from low depth calls were left-tail heavy although the pure distribution is expected to follow a beta-binomial distribution. Extending from a one-parametric power law function [10], we modelled the reduction in variability biased towards the left tail with a log-exponent function, a lognormal prior. Samples were drawn from the closed form cumulative function upon which accurate predictions were observed for those with purity down to 40%, scaled against a complementary sample with purity of at least 70%.…”
Section: Figure (mentioning)
confidence: 99%
“…Using bioinformatic tools to cross reference the normal genome against the aberrant one, the mutations and heterogeneity thereof found in the tumour sample can be derived and used in other analyses. These analyses include, but are not limited to, driver mutation identification (Bailey et al 2018; Gonzalez-Perez et al 2013), which aims to discern the key aberrations that cause a tumour to grow, patient clustering, which aims to identify treatment groups with similar biological characteristics, and evolutionary inference (Gerstung et al 2020; Nik-Zainal et al 2012; Caravagna et al 2020), which informs us how a particular tumour developed from normal cells.…”
Section: Introduction (mentioning)
confidence: 99%
“…Alternatively, if each chromosome is present in three copies (triploid), the expected VAF is 33% if the SNV occurred after the amplification, or 66% if the SNV is on the amplified chromosome and occurred before the amplification. The theoretical frequencies are observed with a Binomial noise model that depends on the depth of sequencing and the actual VAF (Nik-Zainal et al 2012; Caravagna et al 2020). We note that these VAFs hold for pure bulk tumour samples (100% tumour cells).…”
Section: Introduction (mentioning)
confidence: 99%
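
The VAF arithmetic in the statement above can be illustrated with a minimal sketch. It is not code from the cited paper or from the citing article. The function names `expected_vaf` and `simulate_observed_vaf` are hypothetical, and the `purity` and `normal_copies` parameters are an added assumption (normal cells are taken to be diploid and unmutated at the locus); the quoted statement itself only covers pure samples, which corresponds to `purity=1.0`.

```python
import random


def expected_vaf(multiplicity, tumour_copies, purity=1.0, normal_copies=2):
    """Expected VAF of a clonal SNV present on `multiplicity` of the
    `tumour_copies` chromosome copies in every tumour cell.

    Assumption (not in the quoted text): normal cells carry
    `normal_copies` unmutated copies and make up a fraction
    (1 - purity) of the sample.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if not 1 <= multiplicity <= tumour_copies:
        raise ValueError("multiplicity must be between 1 and tumour_copies")
    mutated = purity * multiplicity
    total = purity * tumour_copies + (1.0 - purity) * normal_copies
    return mutated / total


def simulate_observed_vaf(vaf, depth, rng):
    """Draw one observed VAF under Binomial(depth, vaf) read sampling."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    # Each read independently carries the variant with probability `vaf`.
    alt_reads = sum(1 for _ in range(depth) if rng.random() < vaf)
    return alt_reads / depth


# Pure triploid sample: SNV after amplification (one mutated copy)
# versus SNV before amplification (two mutated copies).
late = expected_vaf(multiplicity=1, tumour_copies=3)
early = expected_vaf(multiplicity=2, tumour_copies=3)

# One noisy observation at 100x depth, seeded for reproducibility.
observed = simulate_observed_vaf(late, depth=100, rng=random.Random(0))
```

Here `late` and `early` evaluate to one third and two thirds, matching the 33% and 66% quoted for a pure triploid sample. Lowering `purity` shrinks both values, which is why the statement restricts its figures to 100% tumour cells.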