Recent advances in next generation sequencing have made it possible to precisely characterize all somatic coding mutations that occur during the development and progression of individual cancers. Here we used these approaches to sequence the genomes (>43-fold coverage) and transcriptomes of an oestrogen-receptor-alpha-positive metastatic lobular breast cancer at depth. We found 32 somatic non-synonymous coding mutations present in the metastasis, and measured the frequency of these somatic mutations in DNA from the primary tumour of the same patient, which arose 9 years earlier. Five of the 32 mutations (in ABCB11, HAUS3, SLC24A4, SNX4 and PALB2) were prevalent in the DNA of the primary tumour removed at diagnosis 9 years earlier, six (in KIF1C, USP28, MYH8, MORC1, KIAA1468 and RNASEH2A) were present at lower frequencies (1-13%), 19 were not detected in the primary tumour, and two were undetermined. The combined analysis of genome and transcriptome data revealed two new RNA-editing events that recode the amino acid sequence of SRP9 and COG3. Taken together, our data show that single nucleotide mutational heterogeneity can be a property of low or intermediate grade primary breast cancers and that significant evolution can occur with disease progression.
Key Points
• Complete genome sequence analysis of 40 DLBCL tumors and 13 cell lines reveals novel somatic point mutations, rearrangements, and fusions.
• Recurrent mutations in genes involved in B-cell homing were identified in germinal center B-cell DLBCLs.
Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer composed of at least 2 molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease. Here we provide a whole-genome-sequencing-based perspective of DLBCL mutational complexity by characterizing 40 de novo DLBCL cases and 13 DLBCL cell lines and combining these data with DNA copy number analysis and RNA-seq from an extended cohort of 96 cases. Our analysis identified widespread genomic rearrangements, including evidence for chromothripsis, as well as the presence of known and novel fusion transcripts. We uncovered new gene targets of recurrent somatic point mutations and genes that are targeted by focal somatic deletions in this disease. We highlight the recurrence of germinal center B-cell-restricted mutations affecting genes that encode the S1P receptor and 2 small GTPases (GNA13 and GNAI2) that together converge on regulation of B-cell homing. We further analyzed our data to approximate the relative temporal order in which some recurrent mutations were acquired and demonstrate that ongoing acquisition of mutations and intratumoral clonal heterogeneity are common features of DLBCL. This study further improves our understanding of the processes and pathways involved in lymphomagenesis, and some of the pathways mutated here may indicate new avenues for therapeutic intervention. (Blood. 2013;122(7):1256-1265)
Introduction: Diffuse large B-cell lymphoma (DLBCL) is an aggressive non-Hodgkin lymphoma (NHL) with at least 2 molecular subtypes that demonstrate distinct clinical outcomes and gene expression profiles. Because these cancers derive from mature B cells, the mutations that arise in DLBCLs can result from somatic hypermutation that targets a small number of genes,1 as well as structural rearrangements that arise from double-strand breaks that can be initiated by the B-cell recombination apparatus. In recent years, multiple groups have used massively parallel sequencing (genome/exome sequencing and RNA-seq) to ascertain the full set of genes targeted by somatic single-nucleotide variants (SNVs) in this disease.2-5 On the basis of these and earlier studies,6 it is now known that the 2 molecular subtypes also harbor distinct repertoires of somatic copy number alterations (CNAs) and SNVs. In particular, mutations affecting genes involved in B-cell receptor signaling and nuclear factor κB are common in the activated B-cell variety,7 whereas those affecting certain genes with roles in histone modification may be more common in the germinal center B-cell (GCB) subtype.2,8,9 These studies have confirmed t...
Background: The mountain pine beetle, Dendroctonus ponderosae Hopkins, is the most serious insect pest of western North American pine forests. A recent outbreak destroyed more than 15 million hectares of pine forests, with major environmental effects on forest health, and economic effects on the forest industry. The outbreak has in part been driven by climate change, and will contribute to increased carbon emissions through decaying forests.
Results: We developed a genome sequence resource for the mountain pine beetle to better understand the unique aspects of this insect's biology. A draft de novo genome sequence was assembled from paired-end, short-read sequences from an individual field-collected male pupa, and scaffolded using mate-paired, short-read genomic sequences from pooled field-collected pupae, paired-end short-insert whole-transcriptome shotgun sequencing reads of mRNA from adult beetle tissues, and paired-end Sanger EST sequences from various life stages. We describe the cytochrome P450, glutathione S-transferase, and plant cell wall-degrading enzyme gene families important to the survival of the mountain pine beetle in its harsh and nutrient-poor host environment, and examine genome-wide single-nucleotide polymorphism variation. A horizontally transferred bacterial sucrose-6-phosphate hydrolase was evident in the genome, and its tissue-specific transcription suggests a functional role for this beetle.
Conclusions: Despite Coleoptera being the largest insect order with over 400,000 described species, including many agricultural and forest pest species, this is only the second genome sequence reported in Coleoptera, and will provide an important resource for the Curculionoidea and other insects.
White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. Sequencing and assembling the large and highly repetitive spruce genome, however, pushes the boundaries of current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 gigabase pair (Gbp) draft genome in 4.9 million scaffolds, with a scaffold N50 of 20 356 bp. We demonstrate how recent improvements in sequencing technology, especially increasing read lengths and paired-end reads from longer fragments, have a major impact on assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies.
Availability: The Picea glauca genome sequencing and assembly data are available through NCBI (Accession#: ALWZ0100000000 PID: PRJNA83435). http://www.ncbi.nlm.nih.gov/bioproject/83435.
Contact: ibirol@bcgsc.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
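The scaffold N50 quoted above is a contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the largest length L such that sequences of length >= L
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not taken from the spruce assembly).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

With the toy lengths the total is 32 bases, and the two longest scaffolds (8 + 8) already reach half of that, so the N50 is 8. A higher N50 for the same total length indicates a more contiguous assembly.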
Summary: White spruce (Picea glauca), a gymnosperm tree, has been established as one of the models for conifer genomics. We describe the draft genome assemblies of two white spruce genotypes, PG29 and WS77111, innovative tools for the assembly of very large genomes, and the conifer genomics resources developed in this process. The two white spruce genotypes originate from distant geographic regions of western (PG29) and eastern (WS77111) North America, and represent elite trees in two Canadian tree-breeding programs. We present an update (V3 and V4) for a previously reported PG29 V2 draft genome assembly and introduce a second white spruce genome assembly for genotype WS77111. Assemblies of the PG29 and WS77111 genomes confirm the reconstructed white spruce genome size in the 20 Gbp range, and show broad synteny. Using the PG29 V3 assembly and additional white spruce genomics and transcriptomics resources, we performed MAKER-P annotation and meticulous expert annotation of very large gene families of conifer defense metabolism, the terpene synthases and cytochrome P450s. We also comprehensively annotated the white spruce mevalonate, methylerythritol phosphate and phenylpropanoid pathways. These analyses highlighted the large extent of gene and pseudogene duplications in a conifer genome, in particular for genes of secondary (i.e. specialized) metabolism, and the potential for gain and loss of function for defense and adaptation.
In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as it colonizes a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions within the tripartite MPB/fungus/forest system.
Cryptococcus gattii recently emerged as the causative agent of cryptococcosis in healthy individuals in western North America, despite previous characterization of the fungus as a pathogen in tropical or subtropical regions. As a foundation to study the genetics of virulence in this pathogen, we sequenced the genomes of a strain (WM276) representing the predominant global molecular type (VGI) and a clinical strain (R265) of the major genotype (VGIIa) causing disease in North America. We compared these C. gattii genomes with each other and with the genomes of representative strains of the two varieties of Cryptococcus neoformans that generally cause disease in immunocompromised people. Our comparisons included chromosome alignments, analysis of gene content and gene family evolution, and comparative genome hybridization (CGH). These studies revealed that the genomes of the two representative C. gattii strains (genotypes VGI and VGIIa) are colinear for the majority of chromosomes, with some minor rearrangements. However, multiortholog phylogenetic analysis and an evaluation of gene/sequence conservation support the existence of speciation within the C. gattii complex. More extensive chromosome rearrangements were observed upon comparison of the C. gattii and the C. neoformans genomes. Finally, CGH revealed considerable variation in clinical and environmental isolates as well as changes in chromosome copy numbers in C. gattii isolates displaying fluconazole heteroresistance.
Importance: Isolates of Cryptococcus gattii are currently causing an outbreak of cryptococcosis in western North America, and most of the cases occurred in the absence of coinfection with HIV. This pattern is therefore in stark contrast to the current global burden of one million annual cases of cryptococcosis, caused by the related species Cryptococcus neoformans, in the HIV/AIDS population. The genome sequences of two outbreak-associated major genotypes of C. gattii reported here provide insights into genome variation within and between cryptococcal species. These sequences also provide a resource to further evaluate the epidemiology of cryptococcal disease and to evaluate the role of pathogen genes in the differential interactions of C. gattii and C. neoformans with immunocompromised and immunocompetent hosts.