High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species1–4. To address this issue, the international Genome 10K (G10K) consortium5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.
Whole-genome sequencing projects are increasingly populating the tree of life and characterizing biodiversity1–4. Sparse taxon sampling has previously been proposed to confound phylogenetic inference5, and captures only a fraction of the genomic diversity. Here we report a substantial step towards the dense representation of avian phylogenetic and molecular diversity, by analysing 363 genomes from 92.4% of bird families—including 267 newly sequenced genomes produced for phase II of the Bird 10,000 Genomes (B10K) Project. We use this comparative genome dataset in combination with a pipeline that leverages a reference-free whole-genome alignment to identify orthologous regions in greater numbers than has previously been possible and to recognize genomic novelties in particular bird lineages. The densely sampled alignment provides a single-base-pair map of selection, has more than doubled the fraction of bases that are confidently predicted to be under conservation and reveals extensive patterns of weak selection in predominantly non-coding DNA. Our results demonstrate that increasing the diversity of genomes used in comparative studies can reveal more shared and lineage-specific variation, and improve the investigation of genomic characteristics. We anticipate that this genomic resource will offer new perspectives on evolutionary processes in cross-species comparative analyses and assist in efforts to conserve species.
Microsatellites are the genetic markers of choice for many population genetic studies, but must be isolated de novo using recombinant approaches where prior genetic data are lacking. Here we utilized high-throughput genomic sequencing technology to produce millions of base pairs of short fragment reads, which were screened with bioinformatics toolsets to identify primers that amplify polymorphic microsatellite loci. Using this approach we isolated 13 polymorphic microsatellites for the blue duck (Hymenolaimus malacorhynchos), a species for which limited genetic data were available. Our genomic approach eliminates recombinant genetic steps, significantly reducing the time and cost requirements of marker development compared with traditional approaches. While this application of genomic sequencing may seem obvious to many, this study is, to the best of our knowledge, the first attempt to describe the use of genomic sequencing for the development of microsatellite markers in a non-model organism or indeed any organism.
The major histocompatibility complex (MHC) forms an integral component of the vertebrate immune response and, due to strong selection pressures, is one of the most polymorphic regions of the entire genome. Despite over 15 years of research, empirical studies offer highly contradictory explanations of the relative roles of different evolutionary forces, selection and genetic drift, acting on MHC genes during population bottlenecks. Here, we take a meta-analytical approach to quantify the results of studies into the effects of bottlenecks on MHC polymorphism. We show that the consequences of selection acting on MHC loci prior to a bottleneck event, combined with drift during the bottleneck, will result in overall loss of MHC polymorphism that is ∼15% greater than loss of neutral genetic diversity. These results are counter to general expectations that selection should maintain MHC polymorphism, but do agree with the results of recent simulation models and at least two empirical studies. Notably, our results suggest that negative frequency-dependent selection could be more important than overdominance for maintaining high MHC polymorphism in pre-bottlenecked populations.
Summary 1. Pest eradication is an important facet of conservation and ecological restoration and has been applied successfully to invasive rat species on offshore and oceanic islands. Successful eradication requires the definition of a target population that is of manageable size, with low recolonization risk. We applied a molecular genetic approach to the identification of populations suitable for eradication (eradication units) to provide a new tool to assist the management of brown rats Rattus norvegicus on South Georgia (Southern Ocean). 2. A single eradication attempt on South Georgia (4000 km²) would be an order of magnitude larger than any previously successful rat eradication programme (110 km²). However, rats are demarcated into glacially isolated populations, which could allow sequential eradication. We examined genetic variation at 18 nuclear microsatellite loci to identify gene flow between two glacially isolated rat populations. One population, Greene Peninsula (30 km²), was earmarked for an eradication trial. 3. Genetic diversity in 40 rats sampled from each population showed a pronounced level of genetic population differentiation, allowing individuals to be assigned to the correct population of origin. 4. Our study suggests limited or negligible gene flow between the populations and that glaciers, permanent ice and icy waters restrict rat dispersal on South Georgia. Such barriers define eradication units that, with due care, could be eradicated with low risk of recolonization, hence facilitating the removal of brown rats from South Georgia. 5. Synthesis and applications. We propose that the molecular definition of eradication units is a valuable approach to management as it (i) provides a temporal perspective to gene flow, which is important if dispersal events are rare; (ii) allows an eradication failure (i.e. surviving individuals) to be distinguished from a recolonization event, opening the way for adaptive management in the face of failure; and (iii) can aid the management of pest species in habitat continua by resolving meta-population dynamics, so guiding pest eradication/control strategies. This study further illustrates the developing array of applied ecological issues in which molecular techniques can help guide management.
The measurement of telomere length (TL) is a genetic tool that is beginning to be employed widely in ecological and evolutionary studies as a marker of age and fitness. The adoption of this approach has been accelerated by the development of telomere quantitative PCR, which enables the screening of large numbers of samples with little effort. However, the measurement and interpretation of TL change require a level of rigour that has thus far often been missing where this approach has been employed in an ecological and evolutionary context. In this article, we critically review the literature available on the relationship between TL, age and fitness. We seek to familiarize geneticists, ecologists and evolutionary biologists with the shortcomings of the methods and the most common mistakes made while analysing TL. Prevention of these mistakes will ensure accuracy, reproducibility and comparability of TL studies in different species and allow the identification of ecological and evolutionary principles behind TL dynamics.