Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies now enable assembling genomes at unprecedented quality and contiguity. However, the difficulty in assembling repeat‐rich and GC‐rich regions (genomic “dark matter”) limits insights into the evolution of genome structure and regulatory networks. Here, we compare the efficiency of currently available sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter. By adopting different de novo assembly strategies, we compare individual draft assemblies to a curated multiplatform reference assembly and identify the genomic features that cause gaps within each assembly. We show that a multiplatform assembly implementing long‐read, linked‐read and proximity sequencing technologies performs best at recovering transposable elements, multicopy MHC genes, GC‐rich microchromosomes and the repeat‐rich W chromosome. Telomere‐to‐telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is now possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects for optimized completeness of both the coding and noncoding parts of nonmodel genomes.
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
Accurate gene tree inference is an important aspect of species tree estimation in a summary-coalescent framework. Yet, in empirical studies, inferred gene trees differ in accuracy due to stochastic variation in phylogenetic signal between targeted loci. Empiricists should, therefore, examine the consistency of species tree inference, while accounting for the observed heterogeneity in gene tree resolution of phylogenomic data sets. Here, we assess the impact of gene tree estimation error on summary-coalescent species tree inference by screening ${\sim}2000$ exonic loci based on gene tree resolution prior to phylogenetic inference. We focus on a phylogenetically challenging radiation of Australian lizards (genus Cryptoblepharus, Scincidae) and explore effects on topology and support. We identify a well-supported topology based on all loci and find that a relatively small number of high-resolution gene trees can be sufficient to converge on the same topology. Adding gene trees with decreasing resolution produced a generally consistent topology, and increased confidence for specific bipartitions that were poorly supported when using a small number of informative loci. This corroborates coalescent-based simulation studies that have highlighted the need for a large number of loci to confidently resolve challenging relationships and refutes the notion that low-resolution gene trees introduce phylogenetic noise. Further, our study also highlights the value of quantifying changes in nodal support across locus subsets of increasing size (but decreasing gene tree resolution). Such detailed analyses can reveal anomalous fluctuations in support at some nodes, suggesting the possibility of model violation. By characterizing the heterogeneity in phylogenetic signal among loci, we can account for uncertainty in gene tree estimation and assess its effect on the consistency of the species tree estimate. 
We suggest that the evaluation of gene tree resolution should be incorporated in the analysis of empirical phylogenomic data sets. This will ultimately increase our confidence in species tree estimation using summary-coalescent methods and enable us to exploit genomic data for phylogenetic inference. [Coalescence; concatenation; Cryptoblepharus; exon capture; gene tree; phylogenomics; species tree.]
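The locus-screening idea described in this abstract (ranking loci by gene tree resolution, then analysing subsets of increasing size but decreasing resolution) can be illustrated with a minimal sketch. It is not the authors' pipeline. The resolution metric used here, the fraction of possible internal nodes present in a rooted Newick tree, is an assumed proxy, and the function names `newick_resolution` and `nested_subsets` are hypothetical. The parser assumes plain Newick strings with no commas or parentheses inside labels or comments.

```python
def newick_resolution(newick: str) -> float:
    """Return an assumed resolution proxy for a rooted Newick tree.

    A star tree scores 0.0 and a fully bifurcating tree scores 1.0.
    Assumes labels contain no commas or parentheses.
    """
    n_tips = newick.count(",") + 1      # commas separate siblings: tips - 1
    n_internal = newick.count("(")      # one opening parenthesis per internal node
    if n_tips < 3:
        raise ValueError("need at least three tips to measure resolution")
    # Internal nodes range from 1 (star) to n_tips - 1 (fully bifurcating).
    return (n_internal - 1) / (n_tips - 2)


def nested_subsets(gene_trees: dict[str, str], sizes: list[int]) -> list[list[str]]:
    """Rank loci from most to least resolved and return nested subsets.

    Each subset holds the top-k loci, so larger subsets add gene trees
    of progressively lower resolution.
    """
    ranked = sorted(
        gene_trees,
        key=lambda locus: newick_resolution(gene_trees[locus]),
        reverse=True,
    )
    return [ranked[:k] for k in sizes]


# Toy input: three hypothetical loci with four tips each.
toy_trees = {
    "locus1": "(A,B,C,D);",        # unresolved star tree
    "locus2": "((A,B),(C,D));",    # fully resolved
    "locus3": "((A,B),C,D);",      # partially resolved
}
subsets = nested_subsets(toy_trees, [1, 2, 3])
```

Each subset could then be passed to a summary-coalescent species tree method of choice, with nodal support compared across subsets to see where it stabilises or fluctuates.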
The association of chromosome rearrangements (CRs) with speciation is well established, and there is a long history of theory and evidence relating to “chromosomal speciation.” Genomic sequencing has the potential to provide new insights into how reorganization of genome structure promotes divergence, and in model systems has demonstrated reduced gene flow in rearranged segments. However, there are limits to what we can understand from a small number of model systems, which each only tell us about one episode of chromosomal speciation. Progressing from patterns of association between chromosome (and genic) change to an understanding of the processes of speciation requires both comparative studies across diverse systems and integration of genome-scale sequence comparisons with other lines of evidence. Here, we showcase a promising example of chromosomal speciation in a non-model organism, the endemic Australian marsupial genus Petrogale. We present initial phylogenetic results from exon-capture that resolve a history of divergence associated with extensive and repeated CRs. Yet it remains challenging to disentangle gene tree heterogeneity caused by recent divergence and gene flow in this and other such recent radiations. We outline a way forward for better integration of comparative genomic sequence data with evidence from molecular cytogenetics, and analyses of shifts in the recombination landscape and potential disruption of meiotic segregation and epigenetic programming. In all likelihood, CRs impact multiple cellular processes and these effects need to be considered together, along with effects of genic divergence. Understanding the effects of CRs together with genic divergence will require development of more integrative theory and inference methods. Together, new data and analysis tools will combine to shed light on long-standing questions of how chromosome and genic divergence promote speciation.
Recent radiations are important to evolutionary biologists, because they provide an opportunity to study the mechanisms that link micro- and macroevolution. The role of ecological speciation during adaptive radiation has been intensively studied, but radiations can arise from a diversity of evolutionary processes, in particular on large continental landmasses where allopatric speciation might frequently precede ecological differentiation. It is therefore important to establish a phylogenetic and ecological framework for recent continental-scale radiations that are species-rich and ecologically diverse. Here, we use a genomic (approx. 1200 loci, exon capture) approach to fit branch lengths on a summary-coalescent species tree and generate a time-calibrated phylogeny for a recent and ecologically diverse radiation of Australian scincid lizards, the genus Cryptoblepharus. We then combine the phylogeny with a comprehensive phenotypic dataset for over 800 individuals across the 26 species, and use comparative methods to test whether habitat specialization can explain current patterns of phenotypic variation in ecologically relevant traits. We find significant differences in morphology between species that occur in distinct environments and convergence in ecomorphology with repeated habitat shifts across the continent. These results suggest that isolated analogous habitats have provided parallel ecological opportunity and have repeatedly promoted adaptive diversification. By contrast, speciation processes within the same habitat have resulted in distinct lineages with relatively limited morphological variation. Overall, our study illustrates how alternative diversification processes might have jointly stimulated species proliferation across the continent and generated a remarkably diverse group of Australian lizards.
Songbirds have a species number almost equivalent to that of mammals, and are classic models for studying mechanisms of speciation and sexual selection. Sex chromosomes are hotspots of both processes, yet their evolutionary history in songbirds remains unclear. To elucidate that, we characterize female genomes of 11 songbird species having ZW sex chromosomes, with 5 genomes of bird-of-paradise species newly produced in this work. We conclude that songbird sex chromosomes have undergone at least four steps of recombination suppression before their species radiation, producing a gradient pattern of pairwise sequence divergence termed 'evolutionary strata'. Interestingly, the latest stratum probably emerged due to a songbird-specific burst of retrotransposon CR1-E1 elements at its boundary, or chromosome inversion on the W chromosome. The formation of evolutionary strata has reshaped the genomic architecture of both sex chromosomes. We find stepwise variations of Z-linked inversions, repeat and GC contents, as well as W-linked gene loss rate that are associated with the age of strata. Over 30 W-linked genes have been preserved for their essential functions, indicated by their higher and broader expression of orthologs in lizard than those of other sex-linked genes. We also find a different degree of accelerated evolution of Z-linked genes vs. autosomal genes among different species, potentially reflecting their diversified intensity of sexual selection. Our results uncover the dynamic evolutionary history of songbird sex chromosomes, and provide novel insights into the mechanisms of recombination