We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
Retrotransposons are mobile genetic elements that employ a germ line “copy-and-paste” mechanism to spread throughout metazoan genomes [1]. At least 50% of the human genome is derived from retrotransposons, with three active families (L1, Alu and SVA) associated with insertional mutagenesis and disease [2-3]. Epigenetic and post-transcriptional suppression block retrotransposition in somatic cells [4-5], excluding early embryo development and some malignancies [6-7]. Recent reports of L1 expression [8-9] and copy number variation (CNV) [10-11] in the human brain suggest L1 mobilization may also occur during later development. However, the corresponding integration sites have not been mapped. Here we apply a high-throughput method to identify numerous L1, Alu and SVA germ line mutations, as well as 7,743 putative somatic L1 insertions in the hippocampus and caudate nucleus of three individuals. Surprisingly, we also found 13,692 and 1,350 somatic Alu and SVA insertions, respectively. Our results demonstrate that retrotransposons mobilize to protein-coding genes differentially expressed and active in the brain. Thus, somatic genome mosaicism driven by retrotransposition may reshape the genetic circuitry that underpins normal and abnormal neurobiological processes.
Following the domestication of maize over the past ∼10,000 years, breeders have exploited the extensive genetic diversity of this species to mold its phenotype to meet human needs. The extent of structural variation, including copy number variation (CNV) and presence/absence variation (PAV), which is thought to contribute to the extraordinary phenotypic diversity and plasticity of this important crop, has not been elucidated. Whole-genome, array-based, comparative genomic hybridization (CGH) revealed a level of structural diversity between the inbred lines B73 and Mo17 that is unprecedented among higher eukaryotes. A detailed analysis of altered segments of DNA conservatively estimates that there are several hundred CNV sequences between the two genotypes, as well as several thousand PAV sequences that are present in B73 but not Mo17. Haplotype-specific PAVs contain hundreds of single-copy, expressed genes that may contribute to heterosis and to the extraordinary phenotypic diversity of this important crop.
Altering cytosine methylation by genetic means leads to a variety of developmental defects in mice, plants and fungi. Deregulation of cytosine methylation also has a role in human carcinogenesis. In some cases, these defects have been tied to the inheritance of epigenetic alterations (such as chromatin imprints and DNA methylation patterns) that do not involve changes in DNA sequence. Using a forward genetic screen, we identified a gene (DDM1, decrease in DNA methylation) from the flowering plant Arabidopsis thaliana required to maintain normal cytosine methylation patterns. Additional ddm1 alleles (som4, 5, 6, 7, 8) were isolated in a selection for mutations that relieved transgene silencing (E.J.R., unpublished data). Loss of DDM1 function causes a 70% reduction of genomic cytosine methylation, with most of the immediate hypomethylation occurring in repeated sequences. In contrast, many low-copy sequences initially retain their methylation in ddm1 homozygotes, but lose methylation over time as the mutants are propagated through multiple generations by self-pollination. The progressive effect of ddm1 mutations on low-copy sequence methylation suggests that ddm1 mutations compromise the efficiency of methylation of newly incorporated cytosines after DNA replication. In parallel with the slow decay of methylation during inbreeding, ddm1 mutants accumulate heritable alterations (mutations or stable epialleles) at dispersed sites in the genome that lead to morphological abnormalities. Here we report that DDM1 encodes a SWI2/SNF2-like protein, implicating chromatin remodelling as an important process for maintenance of DNA methylation and genome integrity.
Transcriptomic analyses have revealed an unexpected complexity to the human transcriptome, whose breadth and depth exceed current RNA sequencing capability [1–4]. Using tiling arrays to target and sequence select portions of the transcriptome, we identify and characterize unannotated transcripts whose rare or transient expression is below the detection limits of conventional sequencing approaches. We use the unprecedented depth of coverage afforded by this technique to reach the deepest limits of the human transcriptome, exposing widespread, regulated and remarkably complex noncoding transcription in intergenic regions, as well as unannotated exons and splicing patterns in even intensively studied protein-coding loci such as p53 and HOX. The data also show that intermittent sequenced reads observed in conventional RNA sequencing data sets, previously dismissed as noise, are in fact indicative of unassembled rare transcripts. Collectively, these results reveal the range, depth and complexity of a human transcriptome that is far from fully characterized.
Transposable elements (TEs) are powerful motors of genome evolution yet a comprehensive assessment of recent transposition activity at the species level is lacking for most organisms. Here, using genome sequencing data for 211 Arabidopsis thaliana accessions taken from across the globe, we identify thousands of recent transposition events involving half of the 326 TE families annotated in this plant species. We further show that the composition and activity of the 'mobilome' vary extensively between accessions in relation to climate and genetic factors. Moreover, TEs insert equally throughout the genome and are rapidly purged by natural selection from gene-rich regions because they frequently affect genes, in multiple ways. Remarkably, loci controlling adaptive responses to the environment are the most frequent transposition targets observed. These findings demonstrate the pervasive, species-wide impact that a rich mobilome can have and the importance of transposition as a recurrent generator of large-effect alleles. DOI: http://dx.doi.org/10.7554/eLife.15716.001
This study was originally conceived to test in a rigorous way the specificity of three major approaches to high-throughput array-based DNA methylation analysis: (1) MeDIP, or methylated DNA immunoprecipitation, an example of antibody-mediated methyl-specific fractionation; (2) HELP, or HpaII tiny fragment enrichment by ligation-mediated PCR, an example of differential amplification of methylated DNA; and (3) fractionation by McrBC, an enzyme that cuts most methylated DNA. These results were validated using 1466 Illumina methylation probes on the GoldenGate methylation assay, and discrepancies among the methods were further resolved through quantitative methylation pyrosequencing analysis. While all three methods provide useful information, there were significant limitations to each, specifically bias toward CpG islands in MeDIP, relatively incomplete coverage in HELP, and location imprecision in McrBC. However, we found that with an original array design strategy using tiling arrays and statistical procedures that average information from neighboring genomic locations, much improved specificity and sensitivity could be achieved, e.g., ∼100% sensitivity at 90% specificity with McrBC. We term this approach "comprehensive high-throughput arrays for relative methylation" (CHARM). While this approach was applied to McrBC analysis, the array design and computational algorithms are fractionation method-independent and make this a simple, general, relatively inexpensive tool suitable for genome-wide analysis, in which individual samples can be assayed reliably at very high density, allowing locus-level genome-wide epigenetic discrimination of individuals, not just groups of samples. Furthermore, unlike the other approaches, CHARM is highly quantitative, a substantial advantage in application to the study of human disease.
A number of aberrant morphological phenotypes were noted during propagation of the Arabidopsis thaliana DNA hypomethylation mutant, ddm1, by repeated self-pollination. Onset of a spectrum of morphological abnormalities, including defects in leaf structure, flowering time, and flower structure, was strictly associated with the ddm1 mutations. The morphological phenotypes arose at a high frequency in selfed ddm1 mutant lines and some phenotypes became progressively more severe in advancing generations. The transmission of two common morphological trait syndromes in genetic crosses demonstrated that the phenotypes are caused by heritable lesions that develop in ddm1 mutant backgrounds. Loss of cytosine methylation in specific genomic sequences during the selfing regime was noted in the ddm1 mutants. Potential mechanisms for formation of the lesions underlying the morphological abnormalities are discussed.