Single-cell barcoding technologies have recently been used to perform whole-genome sequencing of thousands of individual cells in parallel. These technologies provide the opportunity to characterize genomic heterogeneity at single-cell resolution, but their extremely low sequencing coverage (<0.05X per cell) has thus far restricted their use to identification of the total copy number of large multi-megabase segments in individual cells. However, total copy numbers do not distinguish between the two homologous chromosomes in humans, and thus provide a limited view of tumor heterogeneity and evolution, missing important events such as copy-neutral loss-of-heterozygosity (LOH). We introduce CHISEL, the first method to infer allele- and haplotype-specific copy numbers in single cells and subpopulations of cells by aggregating sparse signal across thousands of individual cells. We applied CHISEL to 10 single-cell sequencing datasets from 2 breast cancer patients, each dataset containing ≈2,000 cells. We identified extensive allele-specific copy-number aberrations (CNAs) in these samples, including copy-neutral LOH, whole-genome duplications (WGDs), and mirrored-subclonal CNAs in subpopulations of cells. These allele-specific CNAs alter the copy number of genomic regions containing well-known breast cancer genes, including TP53, BRCA2, and PTEN, but are invisible to total copy number analysis. We utilized CHISEL's allele- and haplotype-specific copy numbers to derive a more refined reconstruction of tumor evolution: timing allele-specific CNAs before and after WGDs, identifying low-frequency subclones distinguished by unique CNAs, and uncovering evidence of convergent evolution. This reconstruction is supported by orthogonal analysis of somatic single-nucleotide variants (SNVs) obtained by pooling barcoded reads across clones defined by CHISEL.
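To illustrate why total copy number can hide allele-specific events such as copy-neutral LOH, here is a minimal sketch. It is not CHISEL's method; the function name `describe_state` and the assumption of a diploid baseline (total copy number 2) are introduced here for illustration only.

```python
def describe_state(copies_a, copies_b):
    """Summarize an allele-specific copy-number state for one genomic region.

    copies_a and copies_b are the copy numbers of the two haplotypes.
    Assumption for this sketch: the normal baseline is diploid (total = 2).
    """
    total = copies_a + copies_b
    # LOH: exactly one of the two haplotypes has been lost entirely.
    loh = (copies_a == 0) != (copies_b == 0)
    return {
        "total": total,
        "loh": loh,
        # Copy-neutral LOH keeps the baseline total despite losing a haplotype.
        "copy_neutral_loh": loh and total == 2,
    }


normal = describe_state(1, 1)
cn_loh = describe_state(2, 0)
# Both states have the same total, so a total-copy-number analysis
# cannot tell them apart; only the allele-specific view can.
print(normal["total"] == cn_loh["total"], cn_loh["copy_neutral_loh"])
```

The same reasoning applies to mirrored states such as (2, 1) and (1, 2), which share a total of 3 but differ in which haplotype is amplified.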
Copy-number aberrations (CNAs) and whole-genome duplications (WGDs) are frequent somatic mutations in cancer, but their quantification from DNA sequencing of bulk tumor samples is challenging. Standard methods for CNA inference analyze tumor samples individually; however, DNA sequencing of multiple samples from a cancer patient has recently become more common. We introduce HATCHet (Holistic Allele-specific Tumor Copy-number Heterogeneity), an algorithm that infers allele- and clone-specific CNAs and WGDs jointly across multiple tumor samples from the same patient. We show that HATCHet outperforms current state-of-the-art methods on multi-sample DNA sequencing data that we simulate using MASCoTE (Multiple Allele-specific Simulation of Copy-number Tumor Evolution). Applying HATCHet to 84 tumor samples from 14 prostate and pancreas cancer patients, we identify subclonal CNAs and WGDs that are more plausible than previously published analyses and more consistent with somatic single-nucleotide variants (SNVs) and small indels in the same samples.
Highlights
• Single-nucleotide variants (SNVs) and CNAs are markers of cancer evolution
• Copy-number aberrations (CNAs) may overlap SNVs and result in SNV loss
• Loss-supported model constrains SNV losses to loci with a decrease in copy number
• SCARLET integrates SNVs and CNAs, yielding more accurate single-cell phylogenies
Environmental carcinogenic exposures are major contributors to global disease burden, yet how they promote cancer is unclear. Over 70 years ago, the concept of tumour promoting agents driving latent clones to expand was first proposed. In support of this model, recent evidence suggests that human tissue contains a patchwork of mutant clones, some of which harbour oncogenic mutations, and many environmental carcinogens lack a clear mutational signature. We hypothesised that the environmental carcinogen, <2.5μm particulate matter (PM2.5), might promote lung cancer through nonmutagenic mechanisms by acting on pre-existing mutant clones within normal tissues in patients with lung cancer who have never smoked, a disease with a high frequency of EGFR activating mutations. We analysed PM2.5 levels and cancer incidence reported by UK Biobank, Public Health England, Taiwan Chang Gung Memorial Hospital (CGMH) and Korean Samsung Medical Centre (SMC) from a total of 463,679 individuals between 2006 and 2018. We report associations between PM2.5 levels and the incidence of several cancers, including EGFR mutant lung cancer. We find that pollution on a background of EGFR mutant lung epithelium promotes a progenitor-like cell state and demonstrate that PM accelerates lung cancer progression in EGFR and Kras mutant mouse lung cancer models. Through parallel exposure studies in mouse and human participants, we find evidence that inflammatory mediators, such as interleukin-1β, may act upon EGFR mutant clones to drive expansion of progenitor cells. Ultradeep mutational profiling of histologically normal lung tissue from 247 individuals across 3 clinical cohorts revealed oncogenic EGFR and KRAS driver mutations in 18% and 33% of normal tissue samples, respectively. These results support a tumour-promoting role for PM acting on latent mutant clones in normal lung tissue and add to evidence providing an urgent mandate to address air pollution in urban areas.
B cells are frequently found in the margins of solid tumours as organized follicles in ectopic lymphoid organs called tertiary lymphoid structures (TLS)1,2. Although TLS have been found to correlate with improved patient survival and response to immune checkpoint blockade (ICB), the underlying mechanisms of this association remain elusive1,2. Here we investigate lung-resident B cell responses in patients from the TRACERx 421 (Tracking Non-Small-Cell Lung Cancer Evolution Through Therapy) and other lung cancer cohorts, and in a recently established immunogenic mouse model for lung adenocarcinoma3. We find that both human and mouse lung adenocarcinomas elicit local germinal centre responses and tumour-binding antibodies, and further identify endogenous retrovirus (ERV) envelope glycoproteins as a dominant anti-tumour antibody target. ERV-targeting B cell responses are amplified by ICB in both humans and mice, and by targeted inhibition of KRAS(G12C) in the mouse model. ERV-reactive antibodies exert anti-tumour activity that extends survival in the mouse model, and ERV expression predicts the outcome of ICB in human lung adenocarcinoma. Finally, we find that effective immunotherapy in the mouse model requires CXCL13-dependent TLS formation. Conversely, therapeutic CXCL13 treatment potentiates anti-tumour immunity and synergizes with ICB. Our findings provide a possible mechanistic basis for the association of TLS with immunotherapy response.
Motivation: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly benefits greatly from the advent of 'future-generation' sequencing technologies and their capability to produce long reads at increasing coverage. Existing methods are not able to deal with such data in a fully satisfactory way, either because their accuracy or performance degrades as read length and sequencing coverage increase or because they are based on restrictive assumptions. Results: By exploiting a feature of future-generation technologies (the uniform distribution of sequencing errors), we designed an exact algorithm, called HAPCOL, that is exponential in the maximum number of corrections for each single-nucleotide polymorphism position and that minimizes the overall error-correction score. We performed an experimental analysis, comparing HAPCOL with the current state-of-the-art combinatorial methods on both real and simulated data. On a standard benchmark of real data, we show that HAPCOL is competitive with state-of-the-art methods, improving the accuracy and the number of phased positions. Furthermore, experiments on realistically simulated datasets revealed that HAPCOL requires significantly less computing resources, especially memory. Thanks to its computational efficiency, HAPCOL can overcome the limits of previous approaches, making it possible to phase datasets with higher coverage and without the traditional all-heterozygous assumption.
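The "error-correction score" mentioned above can be made concrete with a small sketch. The code below only evaluates the score of a given candidate haplotype pair, with unit cost per correction; it does not implement HAPCOL's exact algorithm or its per-position correction bound. The function name `error_correction_score` and the read representation (a dict from SNP position to observed allele) are assumptions made for this example.

```python
def error_correction_score(reads, hap1, hap2):
    """Count the allele corrections needed to make every read consistent
    with one of the two candidate haplotypes.

    reads: list of dicts mapping SNP position -> observed allele (0 or 1).
    hap1, hap2: lists of alleles (0 or 1), indexed by SNP position.
    Each read is assigned to whichever haplotype it disagrees with least.
    """
    total = 0
    for read in reads:
        mismatches1 = sum(1 for pos, allele in read.items() if hap1[pos] != allele)
        mismatches2 = sum(1 for pos, allele in read.items() if hap2[pos] != allele)
        total += min(mismatches1, mismatches2)
    return total


# Hypothetical toy data: four reads covering three SNP positions.
reads = [
    {0: 0, 1: 1},
    {1: 1, 2: 0},
    {0: 1, 1: 0},
    {1: 0, 2: 0},  # conflicts with both haplotypes at one position
]
print(error_correction_score(reads, [0, 1, 0], [1, 0, 1]))
```

A haplotype assembler searches over candidate haplotype pairs for one that minimizes this score; the sketch leaves that search out.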
Lung cancer is the leading cause of cancer-associated mortality worldwide1. Here we analysed 1,644 tumour regions sampled at surgery or during follow-up from the first 421 patients with non-small cell lung cancer prospectively enrolled into the TRACERx study. This project aims to decipher lung cancer evolution and address the primary study endpoint: determining the relationship between intratumour heterogeneity and clinical outcome. In lung adenocarcinoma, mutations in 22 out of 40 common cancer genes were under significant subclonal selection, including classical tumour initiators such as TP53 and KRAS. We defined evolutionary dependencies between drivers, mutational processes and whole genome doubling (WGD) events. Despite patients having a history of smoking, 8% of lung adenocarcinomas lacked evidence of tobacco-induced mutagenesis. These tumours also had similar detection rates for EGFR mutations and for RET, ROS1, ALK and MET oncogenic isoforms compared with tumours in never-smokers, which suggests that they have a similar aetiology and pathogenesis. Large subclonal expansions were associated with positive subclonal selection. Patients with tumours harbouring recent subclonal expansions, on the terminus of a phylogenetic branch, had significantly shorter disease-free survival. Subclonal WGD was detected in 19% of tumours, and 10% of tumours harboured multiple subclonal WGDs in parallel. Subclonal, but not truncal, WGD was associated with shorter disease-free survival. Copy number heterogeneity was associated with extrathoracic relapse within 1 year after surgery. These data demonstrate the importance of clonal expansion, WGD and copy number instability in determining the timing and patterns of relapse in non-small cell lung cancer and provide a comprehensive clinical cancer evolutionary data resource.
Background: Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutation is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome's copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis. Results: We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile a to profile b by the minimum number of events needed to transform a into b. Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given k profiles, the second, more general problem seeks a phylogenetic tree whose k leaves are labeled by the k given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum. Conclusions: For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances. Availability: https://github.com/raphael-group/CNT-ILP Electronic supplementary material: The online version of this article (doi:10.1186/s13015-017-0103-2) contains supplementary material, which is available to authorized users.
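The distance between two copy-number profiles described above can be illustrated with a brute-force search on tiny profiles. This is only a sketch under stated assumptions, not the dynamic programming or integer linear program from the paper: each event changes a contiguous segment by one copy, a position with zero copies stays at zero, and the search is capped at a hypothetical `max_copies` so that it terminates. The names `apply_event` and `copy_number_distance` are introduced here for illustration.

```python
from collections import deque


def apply_event(profile, start, end, delta):
    """Apply one segmental event to positions start..end (inclusive).

    delta is +1 (amplification) or -1 (deletion). Assumption: positions
    already at zero copies stay at zero, since lost DNA cannot be regained.
    """
    return tuple(
        c + delta if start <= i <= end and c > 0 else c
        for i, c in enumerate(profile)
    )


def copy_number_distance(source, target, max_copies=4):
    """Breadth-first search for the minimum number of events that turn
    source into target. Returns None if target is unreachable within
    the max_copies cap. Exponential in profile length: tiny inputs only."""
    source, target = tuple(source), tuple(target)
    if len(source) != len(target):
        raise ValueError("profiles must have the same length")
    n = len(source)
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        profile, dist = queue.popleft()
        if profile == target:
            return dist
        for start in range(n):
            for end in range(start, n):
                for delta in (1, -1):
                    nxt = apply_event(profile, start, end, delta)
                    if nxt in seen or max(nxt) > max_copies:
                        continue
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None


print(copy_number_distance((1, 1, 1), (2, 1, 2)))
print(copy_number_distance((0, 1), (1, 1)))
```

Because a position at zero cannot be amplified again, the distance is directed: some targets are unreachable from a given source, which the sketch reports as `None`.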