Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation [1]. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions [2]. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome [3], our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing [4]. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.
To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences.
It was a zoological sensation when a living specimen of the coelacanth was first discovered in 1938, as this lineage of lobe-finned fish was thought to have gone extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain, and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues demonstrate the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.
This unit describes how to use the genome annotation and curation tools MAKER and MAKER‐P to annotate protein‐coding and noncoding RNA genes in newly assembled genomes, update/combine legacy annotations in light of new evidence, add quality metrics to annotations from other pipelines, and map existing annotations to a new assembly. MAKER and MAKER‐P can rapidly annotate genomes of any size, and scale to match available computational resources. © 2014 by John Wiley & Sons, Inc.
Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ~500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms.
IMPORTANCE Evolutionary medicine may provide insights into human physiology and pathophysiology, including tumor biology. OBJECTIVE To identify mechanisms for cancer resistance in elephants and compare cellular response to DNA damage among elephants, healthy human controls, and cancer-prone patients with Li-Fraumeni syndrome (LFS). DESIGN, SETTING, AND PARTICIPANTS A comprehensive survey of necropsy data was performed across 36 mammalian species to validate cancer resistance in large and long-lived organisms, including elephants (n = 644). The African and Asian elephant genomes were analyzed for potential mechanisms of cancer resistance. Peripheral blood lymphocytes from elephants, healthy human controls, and patients with LFS were tested in vitro in the laboratory for DNA damage response. The study included African and Asian elephants (n = 8), patients with LFS (n = 10), and age-matched human controls (n = 11). Human samples were collected at the University of Utah between June 2014 and July 2015. EXPOSURES Ionizing radiation and doxorubicin. MAIN OUTCOMES AND MEASURES Cancer mortality across species was calculated and compared by body size and life span. The elephant genome was investigated for alterations in cancer-related genes. DNA repair and apoptosis were compared in elephant vs human peripheral blood lymphocytes. RESULTS Across mammals, cancer mortality did not increase with body size and/or maximum life span (eg, for rock hyrax, 1% [95% CI, 0%–5%]; African wild dog, 8% [95% CI, 0%–16%]; lion, 2% [95% CI, 0%–7%]). Despite their large body size and long life span, elephants remain cancer resistant, with an estimated cancer mortality of 4.81% (95% CI, 3.14%–6.49%), compared with humans, who have 11% to 25% cancer mortality. While humans have 1 copy (2 alleles) of TP53, African elephants have at least 20 copies (40 alleles), including 19 retrogenes (38 alleles) with evidence of transcriptional activity measured by reverse transcription polymerase chain reaction.
In response to DNA damage, elephant lymphocytes underwent p53-mediated apoptosis at higher rates than human lymphocytes proportional to TP53 status (ionizing radiation exposure: patients with LFS, 2.71% [95% CI, 1.93%–3.48%] vs human controls, 7.17% [95% CI, 5.91%–8.44%] vs elephants, 14.64% [95% CI, 10.91%–18.37%]; P < .001; doxorubicin exposure: human controls, 8.10% [95% CI, 6.55%–9.66%] vs elephants, 24.77% [95% CI, 23.0%–26.53%]; P < .001). CONCLUSIONS AND RELEVANCE Compared with other mammalian species, elephants appeared to have a lower-than-expected rate of cancer, potentially related to multiple copies of TP53. Compared with human cells, elephant cells demonstrated increased apoptotic response following DNA damage. These findings, if replicated, could represent an evolutionary-based approach for understanding mechanisms related to cancer suppression.
Unicellular marine algae have promise for providing sustainable and scalable biofuel feedstocks, although no single species has emerged as a preferred organism. Moreover, adequate molecular and genetic resources prerequisite for the rational engineering of marine algal feedstocks are lacking for most candidate species. Heterokonts of the genus Nannochloropsis naturally have high cellular oil content and are already in use for industrial production of high-value lipid products. First success in applying reverse genetics by targeted gene replacement makes Nannochloropsis oceanica an attractive model to investigate the cell and molecular biology and biochemistry of this fascinating organism group. Here we present the assembly of the 28.7 Mb genome of N. oceanica CCMP1779. RNA sequencing data from nitrogen-replete and nitrogen-depleted growth conditions support a total of 11,973 genes, of which in addition to automatic annotation some were manually inspected to predict the biochemical repertoire for this organism. Among others, more than 100 genes putatively related to lipid metabolism, 114 predicted transcription factors, and 109 transcriptional regulators were annotated. Comparison of the N. oceanica CCMP1779 gene repertoire with the recently published N. gaditana genome identified 2,649 genes likely specific to N. oceanica CCMP1779. Many of these N. oceanica–specific genes have putative orthologs in other species or are supported by transcriptional evidence. However, because similarity-based annotations are limited, functions of most of these species-specific genes remain unknown. Aside from the genome sequence and its analysis, protocols for the transformation of N. oceanica CCMP1779 are provided. 
The availability of genomic and transcriptomic data for Nannochloropsis oceanica CCMP1779, along with efficient transformation protocols, provides a blueprint for future detailed gene functional analysis and genetic engineering of Nannochloropsis species by a growing academic community focused on this genus.
We have optimized and extended the widely used annotation engine MAKER in order to better support plant genome annotation efforts. New features include better parallelization for large repeat-rich plant genomes, noncoding RNA annotation capabilities, and support for pseudogene identification. We have benchmarked the resulting software tool kit, MAKER-P, using the Arabidopsis (Arabidopsis thaliana) and maize (Zea mays) genomes. Here, we demonstrate the ability of the MAKER-P tool kit to automatically update, extend, and revise the Arabidopsis annotations in light of newly available data and to annotate pseudogenes and noncoding RNAs absent from The Arabidopsis Information Resource 10 (TAIR10) build. Our results demonstrate that MAKER-P can be used to manage and improve the annotations of even Arabidopsis, perhaps the best-annotated plant genome. We have also installed and benchmarked MAKER-P at the Texas Advanced Computing Center. We show that this public resource can de novo annotate the entire Arabidopsis and maize genomes in less than 3 h and produce annotations of comparable quality to those of the current TAIR10 and maize V2 annotation builds.