on behalf of the 100,000 Genomes Project
Purpose: Fresh-frozen (FF) tissue is the optimal source of DNA for whole-genome sequencing (WGS) of cancer patients. However, it is not always available, limiting the widespread application of WGS in clinical practice. We explored the viability of using formalin-fixed, paraffin-embedded (FFPE) tissues, available routinely for cancer patients, as a source of DNA for clinical WGS. Methods: We conducted a prospective study using DNAs from matched FF, FFPE, and peripheral blood germ-line specimens collected from 52 cancer patients (156 samples) following routine diagnostic protocols. We compared somatic variants detected in FFPE and matching FF samples. Results: We found that single-nucleotide variant agreement reached 71% across the genome and that detection of somatic copy-number alterations (CNAs) from FFPE samples was suboptimal (0.44 median correlation with FF) due to nonuniform coverage. CNA detection was improved significantly with a lower reverse-crosslinking temperature in FFPE DNA extraction (80°C or 65°C depending on the methods). Our final data showed that somatic variant detection from FFPE for clinical decision making is possible. We detected 98% of clinically actionable variants (including 30/31 CNAs). Conclusion: We present the first prospective WGS study of cancer patients using FFPE specimens collected in a routine clinical environment, proving WGS can be applied in the clinic.
Genet Med advance online publication 1 February 2018
The transformation of chronic lymphocytic leukemia (CLL) to high-grade B-cell lymphoma is known as Richter's Syndrome (RS) and it is a rare event with dismal prognosis. In this study, we conducted whole genome sequencing (WGS) of paired circulating CLL (PB-CLL) and RS biopsies (tissue-RS) from 17 clinical trial (CHOP-O) patients. We found that tissue-RS was enriched for mutations in poor-risk CLL drivers and genes in the DNA damage response (DDR) pathway. In addition, we identified genomic aberrations not previously implicated in RS, including the protein tyrosine phosphatase receptor (PTPRD) and tumour necrosis factor receptor associated factor three (TRAF3). In the non-coding genome, we discovered AID-related and unrelated kataegis in tissue-RS affecting regulatory regions of key immune regulatory genes. These include BTG2, CXCR4, NFATC1, PAX5, NOTCH-1, SLC44A5, FCRL3, SELL, TNIP2 and TRIM13. Furthermore, differences between the global mutation signatures of pairs of PB-CLL and tissue-RS samples implicate DDR as the dominant mechanism driving transformation. Pathway-based clonal de-convolution analysis showed that genes in the MAPK and DDR pathways demonstrate high clonal expansion probability. Direct comparison of nodal-CLL and tissue-RS pairs from an independent cohort confirmed differential expression of the same pathways by RNA expression profiling. Our integrated analysis of WGS and RNA expression data significantly extends previous targeted approaches, which were limited by the lack of germline samples, and it facilitates the identification of novel genomic correlates implicated in RS transformation, which could be targeted therapeutically. Our results inform the future selection of investigative agents for a UK clinical platform study (NCT03899337).
The immunoglobulin heavy-chain variable region gene (IgHV) mutational status is considered the gold standard of prognostication in chronic lymphocytic leukemia (CLL) and is currently determined by Sanger sequencing, which allows analysis of the major clone only. Using next-generation sequencing (NGS), we sequenced the IgHV gene from two independent cohorts: (A) 270 consecutive patient samples obtained at diagnosis and (B) 227 patients from the UK ARCTIC-AdMIRe clinical trials. Using complementary DNA from purified CD19+CD5+ cells, we demonstrated the presence of multiple rearrangements in independent experiments and showed that 24.4% of CLL patients express multiple productive, clonally unrelated IgHV rearrangements. On the basis of IgHV-NGS subclonal profiles, we defined five different categories: patients with (a) multiple hypermutated (M) clones, (b) 1 M clone, (c) a mix of M and unmutated (UM) clones, (d) 1 UM clone and (e) multiple UM clones. In population A, IgHV-NGS classification stratified patients into five different subgroups with a median treatment-free survival (TFS) of >280 (a), 131 (b), 94 (c), 29 (d) and 15 (e) months (P<0.0001) and a median OS of >397 (a), 292 (b), 196 (c), 137 (d) and 100 (e) months (P<0.0001). In population B, the poor prognosis of multiple UM patients was confirmed with a median TFS of 2 months (P=0.0038). In conclusion, IgHV-NGS identified multiple productive IgHV subclones in one quarter of CLL patients, improves disease stratification, and raises important questions concerning the pre-leukemic cellular origin of CLL.
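As an illustration only, the five IgHV-NGS categories (a) to (e) described above can be written as a small decision rule. This is a minimal sketch, not the study's pipeline: it assumes each productive clone has already been labelled "M" or "UM" by some upstream step, and the function name `classify_ighv_profile` and its input format are hypothetical.

```python
def classify_ighv_profile(clone_statuses):
    """Return the category letter (a-e) for one patient.

    clone_statuses: list of per-clone labels, each "M" (hypermutated)
    or "UM" (unmutated). The labelling itself is assumed to be done
    elsewhere; this function only applies the five-category rule.
    """
    if not clone_statuses:
        raise ValueError("at least one productive clone is required")
    unknown = set(clone_statuses) - {"M", "UM"}
    if unknown:
        raise ValueError(f"unexpected clone labels: {sorted(unknown)}")

    n_m = clone_statuses.count("M")
    n_um = clone_statuses.count("UM")

    if n_m and n_um:
        return "c"  # mix of M and UM clones
    if n_um == 0:
        return "a" if n_m > 1 else "b"  # multiple M clones vs. 1 M clone
    return "e" if n_um > 1 else "d"  # multiple UM clones vs. 1 UM clone
```

For example, a patient with two hypermutated clones would fall into category (a), while a patient with one M and one UM clone would fall into category (c).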
We present updates to the SUPERFAMILY 1.75 (http://supfam.org) online resource and protein sequence collection. The hidden Markov model library that provides sequence homology to SCOP structural domains remains unchanged at version 1.75. In the last 4 years SUPERFAMILY has more than doubled its holding of curated complete proteomes over all cellular life, from 1400 proteomes reported previously in 2010 up to 3258 at present. Outside of the main sequence collection, SUPERFAMILY continues to provide domain annotation for sequences provided by other resources such as: UniProt, Ensembl, PDB, much of JGI Phytozome and selected subcollections of NCBI RefSeq. Despite this growth in data volume, SUPERFAMILY now provides users with an expanded and daily updated phylogenetic tree of life (sTOL). This tree is built with genomic-scale domain annotation data as before, but constantly updated when new species are introduced to the sequence library. Our Gene Ontology and other functional and phenotypic annotations previously reported have stood up to critical assessment by the function prediction community. We have now introduced these data in an integrated manner online at the level of an individual sequence, and—in the case of whole genomes—with enrichment analysis against a taxonomically defined background.