Massively parallel sequencing platforms are powerful, but the sequence reads they produce are short. We demonstrate that short reads can be locally assembled into longer contigs using paired-end sequencing of restriction-site associated DNA (RAD-PE) fragments. We use this RAD-PE contig approach to identify single nucleotide polymorphisms (SNPs) and determine haplotype structure in threespine stickleback, and to sequence E. coli and stickleback genomic DNA with overlapping contigs of several hundred nucleotides. We also demonstrate that adding a circularization step allows the local assembly of contigs up to 5 kilobases (kb) in length. The ease of assembly and the accuracy of the individual contigs produced from each RAD site sequence suggest that RAD-PE sequencing is a useful way to convert genome-wide short reads into individually assembled sequences hundreds or thousands of nucleotides long.
Background: Polymorphic loci exist throughout the genomes of a population and provide the raw genetic material needed for a species to adapt to changes in the environment. The minor allele frequencies of rare Single Nucleotide Polymorphisms (SNPs) within a population have been difficult to track with Next-Generation Sequencing (NGS), due to the high error rate of standard methods such as Illumina sequencing. Results: We have developed a wet-lab protocol and variant-calling method that identifies both sequencing and PCR errors, called Paired-End Low Error Sequencing (PELE-Seq). To test the specificity and sensitivity of the PELE-Seq method, we sequenced control E. coli DNA libraries containing known rare alleles present at frequencies ranging from 0.2–0.4% of the total reads. PELE-Seq had higher specificity and sensitivity than standard libraries. We then used PELE-Seq to characterize rare alleles in a Caenorhabditis remanei nematode worm population before and after laboratory adaptation, and found that minor and rare alleles can undergo large changes in frequency during lab-adaptation. Conclusion: We have developed a method of rare allele detection that mitigates both sequencing and PCR errors, called PELE-Seq. PELE-Seq was evaluated using control E. coli populations and was then used to compare a wild C. remanei population to a lab-adapted population. The PELE-Seq method is ideal for investigating the dynamics of rare alleles in a broad range of reduced-representation sequencing methods, including targeted amplicon sequencing, RAD-Seq, ddRAD, and GBS. PELE-Seq is also well-suited for whole genome sequencing of mitochondria and viruses, and for high-throughput rare mutation screens. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2669-3) contains supplementary material, which is available to authorized users.
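To make the 0.2–0.4% frequency range above concrete, here is a minimal Python sketch of the arithmetic behind a rare-allele frequency at a single site. It is not the PELE-Seq variant caller; the function names and the read counts in the usage note are hypothetical and chosen only for illustration.

```python
def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def expected_alt_reads(frequency: float, depth: int) -> float:
    """Expected number of reads supporting an allele at a given frequency and depth."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    return frequency * depth


def in_control_range(frequency: float, low: float = 0.002, high: float = 0.004) -> bool:
    """True if the frequency falls in the 0.2-0.4% range of the control libraries."""
    return low <= frequency <= high
```

For example, with made-up counts of 30 alternate reads out of 10,000, `allele_frequency(30, 10_000)` falls inside the control range. Because only a few dozen reads support such an allele even at that depth, a comparable number of erroneous reads could mask or mimic it.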
Here we present a genome-wide method for de novo identification of enhancer regions and apply it to find enhancers that have increased activity after hypoxia. The method links fragmented genomic DNA to the transcription of randomer molecule identifiers and measures the functional enhancer activity of the library by massively parallel sequencing. We transfected a Drosophila melanogaster library into S2 cells (contributed by Ken Prehoda lab, University of Oregon) in normoxia and hypoxia, and assayed 4,599,881 genomic DNA fragments in parallel. The locations of the enhancer regions strongly correlate with genes up-regulated after hypoxia and previously described enhancers. Novel enhancer regions were identified and integrated with RNAseq data and transcription factor motifs to describe the hypoxic response on a genome-wide basis as a complex regulatory network involving multiple stress-response pathways. This work provides a novel method for high-throughput assay of enhancer activity and the genome-scale identification of hypoxia-activated enhancers in Drosophila.
Gene expression is differently regulated in different cell types and in response to changes to environmental conditions. This regulation is achieved in part by the activity of transcriptional enhancers [1-5], specific DNA sequences that bind transcription factors to control the rate of transcription initiated at nearby promoters. Even for relatively simple processes, such as the acute response to changes in oxygen availability, the identification and characterization of the enhancers used to shift the network of gene expression to a new mode remains limited. ... recently other transcription factors have been implicated in the hypoxic response in a complex network of regulatory events. For example, the immunity response transcription factor NF-κB is also activated by hypoxia and regulates the transcription of HIF-1 [9,10], while HIF-1 appears to play a reciprocal role in the regulation of NF-κB target...
A critical requirement for a systems-level understanding of complex biological processes such as aging is the ability to directly characterize interactions between cells and tissues within a multicellular organism. C. elegans nematodes harboring mutations in the insulin-like receptor daf-2 exhibit dramatically increased lifespans. To identify tissue-specific biochemical mechanisms regulating aging plasticity, we single-cell sequenced 3'-mRNA libraries generated from seven populations of whole day-one adult wild-type and daf-2 -/-
Background: The loss of a single copy of adenomatous polyposis coli (Apc) in leucine-rich repeats and immunoglobulin-like domains 1 (Lrig1)-expressing colonic progenitor cells induces rapid growth of adenomas in mice with high penetrance and multiplicity. The tumors lack functional APC, and a genetic loss of heterozygosity of Apc was previously observed. Methods: To identify genomic features of early tumorigenesis, and to profile intertumoral genetic heterogeneity, tumor exome DNA (n = 9 tumors) and mRNA (n = 5 tumors) sequences were compared with matched nontumoral colon tissue. Putative somatic mutations were called after stringent variant filtering. Somatic signatures of mutational processes were determined and splicing patterns were observed. Results: The adenomas were found to be genetically heterogeneous and unexpectedly hypermutated, displaying a strong bias toward G:C > A:T mutations. A genetic loss of heterozygosity of Apc was not observed; however, an epigenetic loss of heterozygosity was apparent in the tumor transcriptomes. Complex splicing patterns characterized by a loss of intron retention were observed uniformly across tumors. Conclusion: This study demonstrates that early tumors originating from intestinal stem cells with reduced Lrig1 and Apc expression are highly mutated and genetically heterogeneous, with an inflammation-associated mutational signature and complex splicing patterns that are uniform across tumors.
Here we present a genome-wide method for de novo identification of enhancer regions. This approach enables massively parallel empirical investigation of DNA sequences that mediate transcriptional activation and provides a platform for discovery of regulatory modules capable of driving context-specific gene expression. The method links fragmented genomic DNA to the transcription of randomer molecule identifiers and measures the functional enhancer activity of the library by massively parallel sequencing. We transfected a Drosophila melanogaster library into S2 cells in normoxia and hypoxia, and assayed 4,599,881 genomic DNA fragments in parallel. The locations of the enhancer regions strongly correlate with genes up-regulated after hypoxia and previously described enhancers. Novel enhancer regions were identified and integrated with RNAseq data and transcription factor motifs to describe the hypoxic response on a genome-wide basis as a complex regulatory network involving multiple stress-response pathways. This work provides a novel method for high-throughput assay of enhancer activity and the genome-scale identification of hypoxia-activated enhancers in Drosophila.
Murine colonic adenomas induced by the loss of a single copy of the tumor suppressor gene Apc in Lrig1 +/--expressing progenitor cells grow rapidly, with high penetrance and tumor multiplicity. This study investigates the prevalence of intertumoral genetic heterogeneity and phenotypic variation across tumors, and attempts to identify the genomic cause of the unusual phenotype. Adult Lrig1-CreERT2/+; Apc-flox/+ mice were injected intraperitoneally with 2 mg tamoxifen on 3 consecutive days, which induced highly penetrant, dysplastic adenomas in the distal colon ~100 days later. Whole tumors (n=14) from 8 mice were excised and tumor exome DNA and mRNA were sequenced. Somatic mutations present in the tumor exome DNA were compared with adjacent normal colon (n=9 tumors from 3 mice). Putative somatic mutations were called after stringent filtering using SeuratSomatic, a Genome Analysis Toolkit software module. RNA-Seq was performed on tumor mRNA (n=5 tumors from 5 mice) compared to wildtype colon (n=3). Differential gene expression was profiled using the R package DESeq2. Copy number variations and splicing defects were assessed using custom tracks on the UCSC genome browser.
Adenomas resulting from the loss of Apc in Lrig1 +/--expressing colonic progenitor cells are genetically heterogeneous and hypermutated, containing ~25-30 high-quality somatic mutations per megabase. A loss of heterozygosity of Apc was not observed in the tumor genomes; however, evidence of an epigenetic loss of heterozygosity was readily apparent in the tumor transcriptome. The tumors display a strong bias toward G:C > A:T point mutations, which are a signature of guanine adducts, associated with tobacco tar and H. pylori infections. Putative tumor-driving mutations were detected and thousands of differentially expressed genes were identified, including several UDP glucuronosyltransferases. Abnormal splicing patterns characterized by a loss of intron retention were detected in several RNA-binding genes throughout the tumor transcriptome. The widespread defects in gene expression, genomic stability, and splicing patterns imply that an early epigenetic loss of Apc in Lrig1 +/--expressing progenitor cells causes a rapid formation of guanine adducts and a corresponding accumulation of C>A point mutations. This study demonstrates that randomly appearing oncogenic mutations can become fixed into a latent genomic reservoir prior to advanced disease.
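The mutation-rate and substitution-bias figures above rest on two simple calculations, sketched here in Python. This is an illustrative sketch, not the study's pipeline: the function names are hypothetical, the input is assumed to be a list of (reference base, alternate base) pairs for already-called somatic point mutations, and strand collapsing onto the pyrimidine base is a common convention assumed here rather than one stated in the abstract.

```python
from collections import Counter

# Watson-Crick complements, used to fold each substitution onto a pyrimidine reference base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Return the strand-collapsed class (e.g. 'C>T') of a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the equivalent change on the opposite strand,
        # so that G>A and C>T both count as the G:C > A:T class 'C>T'.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(calls):
    """Map each substitution class to its fraction of all calls."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in calls)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def mutations_per_megabase(n_mutations: int, callable_bases: int) -> float:
    """Mutation burden: mutation count divided by the callable territory in megabases."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_mutations / (callable_bases / 1_000_000)
```

With made-up inputs, `mutation_spectrum([("G", "A"), ("C", "T"), ("G", "T"), ("A", "G")])` assigns half of the calls to `C>T`, and `mutations_per_megabase(1500, 50_000_000)` gives 30.0. Both inputs are invented for illustration and are not taken from the study.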