Subclonal reconstruction of tumors by using machine learning and population genetics

Caravagna, Giulio; Heide, Timon; Williams, Marc; Zapata, Luís; Nichol, Daniel; Chkhaidze, Ketevan; Cross, William; Cresswell, George D; Werner, Benjamin; Acar, Ahmet; Chesler, Louis; Barnes, C.; Sanguinetti, Guido; Graham, Trevor A.; Sottoriva, Andrea

doi:10.1038/s41588-020-0675-5

Cited by 94 publications (224 citation statements)

References 63 publications (114 reference statements)

Supporting

Mentioning

207

Contrasting


“…This is mainly due to the fact that cancer follows distinct evolutionary trajectories in patients compared to their genomic landscapes, not only during the initiation and metastasis cascade of cancer cells but also in response to the treatment in cancer therapies [18,19]. For this reason, the accurate identification of subclonal drivers holds great importance for the timing of the subclonal expansion and its diversity in cancer therapies [20]. This sophisticated subclonal identification tool, empowered by machine learning and population genetics, will potentially lead to developing more comprehensive computational methods by integrating with network-driven approaches for cancer systems biology in the future.…”

Section: Cancer Systems Biology For Precision Medicine (mentioning)

confidence: 99%

Systems Biology and Experimental Model Systems of Cancer

Yalçın, Danisik, Baygin, et al. 2020. JPM. Self Cite

Over the past decade, we have witnessed an increasing number of large-scale studies that have provided multi-omics data by high-throughput sequencing approaches. This has particularly helped with identifying key (epi)genetic alterations in cancers. Importantly, aberrations that lead to the activation of signaling networks through the disruption of normal cellular homeostasis are seen both in cancer cells and in the neighboring tumor microenvironment. Cancer systems biology approaches have enabled the efficient integration of experimental data with computational algorithms and the implementation of actionable targeted therapies, as the exceptions, for the treatment of cancer. Comprehensive multi-omics data obtained through the sequencing of tumor samples and experimental model systems will be important in implementing novel cancer systems biology approaches and increasing their efficacy for tailoring novel personalized treatment modalities in cancer. In this review, we discuss emerging cancer systems biology approaches based on multi-omics data derived from bulk and single-cell genomics studies in addition to existing experimental model systems that play a critical role in understanding (epi)genetic heterogeneity and therapy resistance in cancer.


“…Here, the frequency distribution of variants queried from low depth calls were left-tail heavy although the pure distribution is expected to follow a beta-binomial distribution. Extending from a one-parametric power law function [10], we modelled the reduction in variability biased towards the left tail with a log-exponent function, a lognormal prior. Samples were drawn from the closed form cumulative function upon which accurate predictions were observed for those with purity down to 40%, scaled against a complementary sample with purity of at least 70%.…”

Section: Figure (mentioning)

confidence: 99%

Tracing the evolution of aneuploid cancers by multiregional sequencing with CRUST

Chattopadhyay, Karlsson, Valind, et al. 2020. Preprint

To understand the evolutionary dynamics of cancer, clonal deconvolution of mutational landscapes across multiple biopsies from the same patient is crucial. However, the frequencies of mutated alleles are often distorted by variation in copy number of mutated loci as well as the purity across samples. We present a semi-supervised algorithm that normalizes for purity and incorporates allelic composition with bulk sequencing to reliably segregate clonal/subclonal variants even at low sequencing depth (∼50x). In presence of at least one tumor sample with >70% purity, it deconvolves samples down to ∼40% purity, allowing robust tracking of mutated cell populations through cancer evolution.


“…Using bioinformatic tools to cross reference the normal genome against the aberrant one, the mutations and heterogeneity thereof found in the tumour sample can be derived and used in other analyses. These analyses include, but are not limited to, driver mutation identification (Bailey et al 2018; Gonzalez-Perez et al 2013), which aims to discern the key aberrations that cause a tumour to grow, patient clustering, which aims to identify treatment groups with similar biological characteristics, and evolutionary inference (Gerstung et al 2020; Nik-Zainal et al 2012; Caravagna et al 2020), which informs us how a particular tumour developed from normal cells.…”

Section: Introduction (mentioning)

confidence: 99%

“…Alternatively, if each chromosome is present in three copies (triploid), the expected VAF is 33% (if the SNV occurred after the amplification) or 66% (if the SNV is on the amplified chromosome and occurred before the amplification). The theoretical frequencies are observed with a Binomial noise model that depends on the depth of sequencing and the actual VAF (Nik-Zainal et al 2012; Caravagna et al 2020). We note that these VAFs hold for pure bulk tumour samples (100% tumour cells).…”

Section: Introduction (mentioning)

confidence: 99%
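The VAF arithmetic in the statement above can be illustrated with a minimal Python sketch. It is not taken from the cited papers: the function names, the `purity` and `normal_copies` parameters, and the sequencing depth of 100 are assumptions made for illustration. The purity adjustment uses the commonly used mixture of tumour and diploid normal cells; with `purity=1.0` it reduces to the pure-tumour case the statement describes.

```python
import random


def expected_vaf(multiplicity, tumour_copies, purity=1.0, normal_copies=2):
    """Expected variant allele frequency of a clonal SNV.

    multiplicity  : number of tumour chromosome copies carrying the SNV
    tumour_copies : total copies of the locus in tumour cells
    purity        : fraction of tumour cells in the sample (assumption;
                    1.0 is the pure-tumour case in the quoted text)
    normal_copies : copies of the locus in normal cells (assumed diploid)
    """
    mutated = purity * multiplicity
    total = purity * tumour_copies + (1.0 - purity) * normal_copies
    return mutated / total


def simulate_observed_vaf(vaf, depth, rng):
    """Draw one observed VAF under Binomial(depth, vaf) read sampling."""
    alt_reads = sum(1 for _ in range(depth) if rng.random() < vaf)
    return alt_reads / depth


# Triploid segment, pure tumour sample.
late = expected_vaf(multiplicity=1, tumour_copies=3)   # SNV after amplification
early = expected_vaf(multiplicity=2, tumour_copies=3)  # SNV before amplification

rng = random.Random(0)  # fixed seed so the draw is reproducible
observed = simulate_observed_vaf(early, depth=100, rng=rng)

print(f"expected VAF, SNV after amplification:  {late:.3f}")
print(f"expected VAF, SNV before amplification: {early:.3f}")
print(f"one simulated observation at depth 100: {observed:.2f}")
```

Lowering `depth` widens the spread of simulated VAFs around the expected value, which is why the two triploid peaks become harder to separate in low-coverage data.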

Computational validation of clonal and subclonal copy number alterations from bulk tumour sequencing

Househam, Bergamin, Milite, et al. 2021. Preprint. Self Cite

Cancer is a global health issue that places enormous demands on healthcare systems. Basic research, the development of targeted treatments, and the utility of DNA sequencing in clinical settings, have been significantly improved with the introduction of whole genome sequencing. However the broad applications of this technology come with complications. To date there has been very little standardisation in how data quality is assessed, leading to inconsistencies in analyses and disparate conclusions. Manual checking and complex consensus calling strategies often do not scale to large sample numbers, which leads to procedural bottlenecks. To address this issue, we present a quality control method that integrates point mutations, copy numbers, and other metrics into a single quantitative score. We demonstrate its power on 1,065 whole-genomes from a large-scale pan-cancer cohort, and on multi-region data of two colorectal cancer patients. We highlight how our approach significantly improves the generation of cancer mutation data, providing visualisations for cross-referencing with other analyses. Our approach is fully automated, designed to work downstream of any bioinformatic pipeline, and can automatise tool parameterization paving the way for fast computational assessment of data quality in the era of whole genome sequencing.
