A better understanding of the features that define the interplay between cancer cells and immune cells is key to identifying new cancer therapies 1. Yet, focus is often given to those interactions that occur within the primary tumor and its microenvironment, while the role of immune cells during cancer dissemination in patients remains largely uncharacterized 2,3. Circulating tumor cells (CTCs) are precursors of metastasis in several cancer types [4][5][6], and are occasionally found within the bloodstream in association with non-malignant cells such as white blood cells (WBCs) 7,8. The identity and function of these CTC-associated WBCs, as well as the molecular features that define the interaction between WBCs and CTCs, are unknown. Here, we achieve the isolation and interrogation of individual CTC-associated WBCs, alongside the corresponding cancer cells within each CTC-WBC cluster, from multiple breast cancer patients and mouse models. Single-cell RNA sequencing reveals a specific pattern of WBCs attached to CTCs, with neutrophils representing the majority of the cases. When comparing the transcriptome profiles of CTCs that were associated with neutrophils with those of CTCs alone, we detect a number of differentially expressed genes that outline cell cycle progression, leading to a higher ability to efficiently seed metastasis. Additionally, we identify cell-cell junction and cytokine-receptor pairs that define CTC-neutrophil clusters, representing key vulnerabilities of the metastatic process. Thus, the association between neutrophils and CTCs fuels cell cycle progression within the bloodstream and expands the metastatic potential of CTCs, providing a rationale for targeting this interaction in breast cancer.

Main Text

Circulating tumor cells (CTCs) are precursors of metastasis in various solid cancers including breast cancer 6, and are occasionally found in association with white blood cells (WBCs) 7.
The role of CTC-WBC clusters in metastasis development as well as the principles that govern the interplay between CTCs and WBCs during blood-borne metastasis are largely uncharacterized.

We first sought to determine the number and composition of CTC-WBC clusters in breast cancer patients and mouse models. We obtained blood samples from 70 patients with invasive breast cancer who discontinued their treatment due to progressive disease, as well as from five different breast cancer mouse models, and we enriched for CTCs using the Parsortix microfluidic device 9 (Extended Data Fig. 1a-e). Live CTCs were stained for cancer-associated cell surface markers EpCAM, HER2, and EGFR or imaged directly for the expression of GFP, as well as labeled for CD45 to identify WBCs (Fig. 1a and Extended Data Fig. 1f). Among 70 patients, 34 (48.6%) had detectable CTCs, with a mean number of 22 CTCs per 7.5 ml of blood (Supplementary Tables 1 and 2). While the majority of CTCs were single (88.0%), we also detected CTC clusters (8.6%) and CTC-WBC clusters (3.4%) (Fig. 1b and Extended Data Fig. 1g,h). Similarly, we observed that CTC-...
Reconstructing the evolution of tumors is a key step towards identifying appropriate cancer therapies. The task is challenging because tumors evolve as heterogeneous cell populations. Single-cell sequencing holds the promise of resolving the heterogeneity of tumors; however, it has its own challenges, including elevated error rates, allelic drop-out, and uneven coverage. Here, we develop a new approach to mutation detection in individual tumor cells by leveraging the evolutionary relationship among cells. Our method, called SCIΦ, jointly calls mutations in individual cells and estimates the tumor phylogeny among these cells. Employing a Markov Chain Monte Carlo scheme enables us to reliably call mutations in each single cell even in experiments with high drop-out rates and missing data. We show that SCIΦ outperforms existing methods on simulated data and apply it to different real-world datasets, namely a whole-exome breast cancer dataset as well as a panel acute lymphoblastic leukemia dataset.
Clonal hematopoiesis (CH) is associated with age and an increased risk of myeloid malignancies, cardiovascular risk, and all-cause mortality. We tested for CH in a setting where hematopoietic stem cells (HSCs) of the same individual are exposed to different degrees of proliferative stress and environments, i.e., in long-term survivors of allogeneic hematopoietic stem cell transplantation (allo-HSCT) and their respective related donors (n = 42 donor-recipient pairs). With a median follow-up time since allo-HSCT of 16 years (range, 10-32 years), we found a total of 35 mutations in 23 out of 84 (27.4%) study participants. Ten out of 42 donors (23.8%) and 13 out of 42 recipients (31.0%) had CH. CH was associated with older donor and recipient age. We identified 5 cases of donor-engrafted CH, with 1 case progressing into myelodysplastic syndrome in both donor and recipient. Four out of 5 cases showed increased clone size in recipients compared with donors. We further characterized the hematopoietic system in individuals with CH as follows: (1) CH was consistently present in myeloid cells but varied in penetrance in B and T cells; (2) colony-forming units (CFUs) revealed clonal evolution or multiple independent clones in individuals with multiple CH mutations; and (3) telomere shortening determined in granulocytes suggested ∼20 years of added proliferative history of HSCs in recipients compared with their donors, with telomere length in CH vs non-CH CFUs showing varying patterns. This study provides insight into the long-term behavior of the same human HSCs and respective CH development under different proliferative conditions.
Motivation: Next-generation sequencing technologies produce unprecedented amounts of data, leading to completely new research fields. One of these is metagenomics, the study of large-size DNA samples containing a multitude of diverse organisms. A key problem in metagenomics is to functionally and taxonomically classify the sequenced DNA, to which end the well-known BLAST program is usually used. But BLAST has dramatic resource requirements at metagenomic scales of data, imposing a high financial or technical burden on the researcher. Multiple attempts have been made to overcome these limitations and present a viable alternative to BLAST.
Results: In this work we present Lambda, our own alternative for BLAST in the context of sequence classification. In our tests, Lambda often outperforms the best tools at reproducing BLAST’s results and is the fastest compared with the current state of the art at comparable levels of sensitivity.
Availability and implementation: Lambda was implemented in the SeqAn open-source C++ library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/lambda.
Contact: hannes.hauswedell@fu-berlin.de
Supplementary information: Supplementary data are available at Bioinformatics online.
Background: Next-generation sequencing of matched tumor and normal biopsy pairs has become a technology of paramount importance for precision cancer treatment. Sequencing costs have dropped tremendously, allowing the sequencing of the whole exome of tumors for just a fraction of the total treatment costs. However, clinicians and scientists cannot take full advantage of the generated data because the accuracy of analysis pipelines is limited. This particularly concerns the reliable identification of subclonal mutations in a cancer tissue sample with very low frequencies, which may be clinically relevant.
Results: Using simulations based on kidney tumor data, we compared the performance of nine state-of-the-art variant callers, namely deepSNV, GATK HaplotypeCaller, GATK UnifiedGenotyper, JointSNVMix2, MuTect, SAMtools, SiNVICT, SomaticSniper, and VarScan2. The comparison was done as a function of variant allele frequencies and coverage. Our analysis revealed that deepSNV and JointSNVMix2 perform very well, especially in the low-frequency range. We attributed false positive and false negative calls of the nine tools to specific error sources and assigned them to processing steps of the pipeline. All of these errors can be expected to occur in real data sets. We found that modifying certain steps of the pipeline or parameters of the tools can lead to substantial improvements in performance. Furthermore, a novel integration strategy that combines the ranks of the variants yielded the best performance. More precisely, the rank-combination of deepSNV, JointSNVMix2, MuTect, SiNVICT and VarScan2 reached a sensitivity of 78% when fixing the precision at 90%, and outperformed all individual tools, where the maximum sensitivity was 71% with the same precision.
Conclusions: The choice of well-performing tools for alignment and variant calling is crucial for the correct interpretation of exome sequencing data obtained from mixed samples, and common pipelines are suboptimal.
We were able to relate observed substantial differences in performance to the underlying statistical models of the tools, and to pinpoint the error sources of false positive and false negative calls. These findings might inspire new software developments that improve exome sequencing pipelines and further the field of precision cancer treatment.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1417-7) contains supplementary material, which is available to authorized users.
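The rank-combination idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state how ranks are combined, so the mean-rank rule, the worst-rank penalty for variants a caller did not report, and the names `combine_ranks` and `sensitivity_at_precision` are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def combine_ranks(caller_scores):
    """Combine per-caller variant rankings by mean rank (assumed rule).

    caller_scores maps a caller name to a dict {variant_id: score},
    where a higher score means a more confident call. Variant ids are
    strings so that ties can be broken deterministically by name.
    """
    variants = set()
    for scores in caller_scores.values():
        variants.update(scores)

    rank_sums = defaultdict(float)
    for scores in caller_scores.values():
        # Ordinal ranks: best score gets rank 1; ties broken by variant id.
        ordered = sorted(scores, key=lambda v: (-scores[v], v))
        ranks = {v: i + 1 for i, v in enumerate(ordered)}
        worst = len(ordered) + 1  # assumed penalty for unreported variants
        for v in variants:
            rank_sums[v] += ranks.get(v, worst)

    n_callers = len(caller_scores)
    mean_rank = {v: rank_sums[v] / n_callers for v in variants}
    ranked = sorted(variants, key=lambda v: (mean_rank[v], v))
    return ranked, mean_rank


def sensitivity_at_precision(ranked, truth, min_precision):
    """Best sensitivity over all top-k cutoffs whose precision >= min_precision."""
    best = 0.0
    true_positives = 0
    for k, variant in enumerate(ranked, start=1):
        if variant in truth:
            true_positives += 1
        if true_positives / k >= min_precision:
            best = max(best, true_positives / len(truth))
    return best


if __name__ == "__main__":
    # Toy scores for two hypothetical callers; not real data.
    scores = {
        "callerA": {"chr1:100": 0.9, "chr1:200": 0.5},
        "callerB": {"chr1:200": 0.8, "chr1:300": 0.7},
    }
    ranked, mean_rank = combine_ranks(scores)
    print(ranked)
    print(sensitivity_at_precision(ranked, {"chr1:100", "chr1:200"}, 0.9))
```

Evaluating sensitivity at a fixed precision, as in the second function, mirrors the way the abstract reports its comparison (sensitivity with precision fixed at 90%), although the toy inputs here say nothing about the reported figures.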
Determining the composition of viral populations is becoming increasingly important in the field of medical virology. While recently developed computational tools for viral haplotype analysis allow for correcting sequencing errors, they do not always allow for the removal of errors occurring in the upstream experimental protocol, such as PCR errors. Primer IDs (pIDs) are one method to address this problem by harnessing redundant template resampling for error correction. By using a reference mixture of five HIV-1 strains, we show how pIDs can be useful for estimating key experimental parameters, such as the substitution rate of the PCR process and the reverse transcription (RT) error rate. In addition, we introduce a hidden Markov model for determining the recombination rate of the RT PCR process. We found no strong sequence-specific bias in pID abundances (the same RT efficiencies as compared to commonly used short, specific RT primers) and no effects of pIDs on the estimated distribution of the reference viruses.
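As a rough illustration of how redundant template resampling can support error correction, the sketch below groups reads by their pID and takes a per-position majority vote. It assumes reads are already aligned and of equal length; the function name `consensus_by_pid` and the `min_reads` threshold are hypothetical choices, not details taken from the abstract.

```python
from collections import Counter, defaultdict


def consensus_by_pid(reads, min_reads=3):
    """Build one consensus sequence per primer ID by majority vote.

    reads is an iterable of (pid, sequence) pairs. Sequences sharing a
    pID are assumed to come from the same template, be pre-aligned, and
    have equal length. Groups with fewer than min_reads reads are
    dropped, since a majority vote over so few reads is unreliable.
    """
    groups = defaultdict(list)
    for pid, seq in reads:
        groups[pid].append(seq)

    consensus = {}
    for pid, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # zip(*seqs) yields one tuple of bases per alignment column.
        consensus[pid] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    # Toy reads: the third read of pID "AAT" carries one substitution.
    reads = [
        ("AAT", "ACGT"),
        ("AAT", "ACGT"),
        ("AAT", "ACTT"),
        ("GGC", "TTTT"),
    ]
    print(consensus_by_pid(reads))
```

In this toy input the isolated substitution is outvoted by the two agreeing reads, while the singleton pID is discarded rather than trusted.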
Molecular profiling of tumor biopsies plays an increasingly important role not only in cancer research, but also in the clinical management of cancer patients. Multi-omics approaches hold the promise of improving diagnostics, prognostics and personalized treatment. To deliver on this promise of precision oncology, appropriate bioinformatics methods for managing, integrating and analyzing large and complex data are necessary. Here, we discuss the specific requirements of bioinformatics methods and software that arise in the setting of clinical oncology, owing to a stricter regulatory environment and the need for rapid, highly reproducible and robust procedures. We describe the workflow of a molecular tumor board and the specific bioinformatics support that it requires, from the primary analysis of raw molecular profiling data to the automatic generation of a clinical report and its delivery to decision-making clinical oncologists. Such workflows have to various degrees been implemented in many clinical trials, as well as in molecular tumor boards at specialized cancer centers and university hospitals worldwide. We review these and more recent efforts to include other high-dimensional multi-omics patient profiles into the tumor board, as well as the state of clinical decision support software to translate molecular findings into treatment recommendations.
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB's Bioinformatics resource portal ExPASy features over 150 resources, including UniProtKB/Swiss-Prot, ENZYME, PROSITE, neXtProt, STRING, UniCarbKB, SugarBindDB, SwissRegulon, EPD, arrayMap, Bgee, SWISS-MODEL Repository, OMA, OrthoDB and other databases, which are briefly described in this article.