Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Dfam is an open access database of repetitive DNA families, sequence models, and genome annotations. The 3.0–3.3 releases of Dfam (https://dfam.org) represent an evolution from a proof-of-principle collection of transposable element families in model organisms into a community resource for a broad range of species, and for both curated and uncurated datasets. In addition, releases since Dfam 3.0 provide auxiliary consensus sequence models, transposable element protein alignments, and a formalized classification system to support the growing diversity of organisms represented in the resource. The latest release includes 266,740 new de novo generated transposable element families from 336 species contributed by the EBI. This expansion demonstrates the utility of many of Dfam’s new features and provides insight into the long term challenges ahead for improving de novo generated transposable element datasets.
The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.
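The contig N50 quoted above is a length-weighted median: the contig length at which contigs of that size or longer hold at least half of all assembled bases. A minimal sketch of that calculation follows. The function name `contig_n50` and the toy lengths are illustrative assumptions, not values or code from the Mmul_10 assembly.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, in arbitrary units).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(contig_n50(toy_lengths))
```

Because N50 is weighted by length, merging a few large contigs can shift it substantially even when many small contigs remain, so it is usually read alongside total assembly size.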
Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.
The 3.0-3.2 releases of Dfam (https://dfam.org) represent an evolution from a proof-of-principle collection of transposable element families in model organisms into a community resource for a broad range of species and for both curated and uncurated datasets. In addition, releases since Dfam 3.0 provide auxiliary consensus sequence models, transposable element protein alignments, and a formalized classification system to support the growing diversity of organisms represented in the resource. The latest release includes 266,740 new de novo generated transposable element families from 336 species contributed by the EBI. This expansion demonstrates the utility of many of Dfam’s new features and provides insight into the long term challenges ahead for improving de novo generated transposable element datasets.
The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic “young” Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages.
In plants, RNA-directed DNA methylation (RdDM) employs small RNAs to target enzymes that methylate cytosine residues. Cytosine methylation and dimethylation of histone 3 lysine 9 (H3K9me2) are often linked. Together they condition an epigenetic defense that results in chromatin compaction and transcriptional silencing of transposons and viral chromatin. Canonical RdDM (Pol IV-RdDM), involving RNA polymerases IV and V (Pol IV and Pol V), was believed to be necessary to establish cytosine methylation, which in turn could recruit H3K9 methyltransferases. However, recent studies have revealed that a pathway involving Pol II and RNA-dependent RNA polymerase 6 (RDR6) (RDR6-RdDM) is likely responsible for establishing cytosine methylation at naive loci, while Pol IV-RdDM acts to reinforce and maintain it. We used the geminivirus Beet curly top virus (BCTV) as a model to examine the roles of Pol IV and Pol V in establishing repressive viral chromatin methylation. As geminivirus chromatin is formed de novo in infected cells, these viruses are unique models for processes involved in the establishment of epigenetic marks. We confirm that Pol IV and Pol V are not needed to establish viral DNA methylation but are essential for its amplification. Remarkably, however, both Pol IV and Pol V are required for deposition of H3K9me2 on viral chromatin. Our findings suggest that cytosine methylation alone is not sufficient to trigger de novo deposition of H3K9me2 and further that Pol IV-RdDM is responsible for recruiting H3K9 methyltransferases to viral chromatin. IMPORTANCE In plants, RNA-directed DNA methylation (RdDM) uses small RNAs to target cytosine methylation, which is often linked to H3K9me2. These epigenetic marks silence transposable elements and DNA virus genomes, but how they are established is not well understood.
Canonical RdDM, involving Pol IV and Pol V, was thought to establish cytosine methylation that in turn could recruit H3K9 methyltransferases, but recent studies compel a reevaluation of this view. We used BCTV to investigate the roles of Pol IV and Pol V in chromatin methylation. We found that both are needed to amplify, but not to establish, DNA methylation. However, both are required for deposition of H3K9me2. Our findings suggest that cytosine methylation is not sufficient to recruit H3K9 methyltransferases to naive viral chromatin and further that Pol IV-RdDM is responsible. Repressive chromatin methylation suppresses the expression of transposable elements and DNA viruses and leads to the establishment of transcriptional gene silencing (TGS). Plants employ RNA-directed DNA methylation (RdDM) to target methylation of cytosine residues and use cytosine methylation and associated histone 3 lysine 9 dimethylation (H3K9me2) to silence invasive DNAs, such as transposons and geminiviruses. As this study addresses the roles of RNA polymerases in repressive methylation of geminivirus chromatin, an overview of relevant pathways and their interrelationships is presented (1, 2). In the reference plant Arabido...
Background: The rice weevil Sitophilus oryzae is one of the most important agricultural pests, causing extensive damage to cereal in fields and to stored grains. S. oryzae has an intracellular symbiotic relationship (endosymbiosis) with the Gram-negative bacterium Sodalis pierantonius and is a valuable model to decipher host-symbiont molecular interactions. Results: We sequenced the Sitophilus oryzae genome using a combination of short and long reads to produce the best assembly for a Curculionidae species to date. We show that S. oryzae has undergone successive bursts of transposable element (TE) amplification, representing 72% of the genome. In addition, we show that many TE families are transcriptionally active, and changes in their expression are associated with insect endosymbiotic state. S. oryzae has undergone a high gene expansion rate, when compared to other beetles. Reconstruction of host-symbiont metabolic networks revealed that, despite its recent association with cereal weevils (30 kyear), S. pierantonius relies on the host for several amino acids and nucleotides to survive and to produce vitamins and essential amino acids required for insect development and cuticle biosynthesis. Conclusions: Here we present the genome of an agricultural pest beetle, which may act as a foundation for pest control. In addition, S. oryzae may be a useful model for endosymbiosis, and studying TE evolution and regulation, along with the impact of TEs on eukaryotic genomes.