Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations.
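For readers unfamiliar with the statistic, a per-site FST between two population samples can be sketched as follows. The abstract does not say which estimator was used. This sketch assumes the Hudson estimator in the form given by Bhatia et al. (2013), purely for illustration; the function name and arguments are hypothetical.

```python
def hudson_fst(p1, n1, p2, n2):
    """Per-site Hudson FST (assumed estimator, not necessarily the paper's).

    p1, p2: sample allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population (each >= 2).
    Returns None when the site is monomorphic across both samples.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two chromosomes per population")
    # Squared frequency difference, corrected for finite sample size.
    numerator = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    # Probability that two alleles drawn from different populations differ.
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        return None
    return numerator / denominator
```

Window-level values are usually obtained by summing numerators and denominators across sites before dividing, rather than by averaging per-site ratios.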
Mammalian sex chromosomes have undergone profound changes since evolving from ancestral autosomes. By examining retroposed genes in the human and mouse genomes, we demonstrate that, during evolution, the mammalian X chromosome has generated and recruited a disproportionately high number of functional retroposed genes, whereas the autosomes experienced lower gene turnover. Most autosomal copies originating from X-linked genes exhibited testis-biased expression. Such export is incompatible with mutational bias and is likely driven by natural selection to attain male germline function. However, the excess recruitment is consistent with a combination of both natural selection and mutational bias.
Analysis of the genomes and transcriptomes of snake species with homomorphic and heteromorphic sex chromosomes reveals the evolutionary dynamics of sex chromosome differentiation.
The role that natural selection plays in governing the locations and early evolution of copy-number mutations remains largely unexplored. We used high-density full-genome tiling arrays to create a fine-scale genomic map of copy-number polymorphisms (CNPs) in Drosophila melanogaster. We inferred a total of 2658 independent CNPs, 56% of which overlap genes. These include CNPs that are likely to be under positive selection, most notably high-frequency duplications encompassing toxin-response genes. The locations and frequencies of CNPs are strongly shaped by purifying selection, with deletions under stronger purifying selection than duplications. Among duplications, those overlapping exons or introns, as well as those falling on the X chromosome, seem to be subject to stronger purifying selection.
Genome assemblies that are accurate, complete and contiguous are essential for identifying important structural and functional elements of genomes and for identifying genetic variation. Nevertheless, most recent genome assemblies remain incomplete and fragmented. While long molecule sequencing promises to deliver more complete genome assemblies with fewer gaps, concerns about error rates, low yields, stringent DNA requirements and uncertainty about best practices may discourage many investigators from adopting this technology. Here, in conjunction with the platinum standard Drosophila melanogaster reference genome, we analyze recently published long molecule sequencing data to identify what governs completeness and contiguity of genome assemblies. We also present a hybrid meta-assembly approach that achieves remarkable assembly contiguity for both Drosophila and human assemblies with only modest long molecule sequencing coverage. Our results motivate a set of preliminary best practices for obtaining accurate and contiguous assemblies, a ‘missing manual’ that guides key decisions in building high quality de novo genome assemblies, from DNA isolation to polishing the assembly.
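Assembly contiguity is commonly summarized by N50, the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The abstract does not name a metric, so using N50 here is an assumption. A minimal sketch:

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")
```

A higher N50 indicates that more of the assembly sits in long contigs. N50 says nothing about correctness, so it is best read alongside completeness and accuracy measures.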
It has been hypothesized that individually rare hidden structural variants (SVs) could account for a significant fraction of variation in complex traits. Here we identified more than 20,000 euchromatic SVs from 14 Drosophila melanogaster genome assemblies, of which ~40% are invisible to high-specificity short-read genotyping approaches. SVs are common, with 31.5% of diploid individuals harboring an SV in genes larger than 5 kb, and 24% harboring multiple SVs in genes larger than 10 kb. SVs segregate at lower minor allele frequencies than amino acid polymorphisms, suggesting that SVs are more deleterious. We show that a number of functionally important genes harbor previously hidden structural variants likely to affect complex phenotypes. Furthermore, SVs are overrepresented in candidate genes associated with quantitative trait loci mapped using the Drosophila Synthetic Population Resource. We conclude that SVs are ubiquitous, frequently constitute a heterogeneous allelic series, and can act as rare alleles of large effect.
Gene expression is regulated both by cis elements, which are DNA segments closely linked to the genes they regulate, and by trans factors, which are usually proteins capable of diffusing to unlinked genes. Understanding the patterns and sources of regulatory variation is crucial for understanding phenotypic and genome evolution. Here, we measure genome-wide allele-specific expression by deep sequencing to investigate the patterns of cis and trans expression variation between two strains of Saccharomyces cerevisiae. We propose a statistical modeling framework based on the binomial distribution that simultaneously addresses normalization of read counts derived from different parents and estimation of the cis and trans expression variation parameters. We find that expression polymorphism in yeast is common for both cis and trans, though trans variation is more common. Constraint in expression evolution is correlated with other hallmarks of constraint, including gene essentiality, number of protein interaction partners, and constraint in amino acid substitution, indicating that both cis and trans polymorphism are clearly under purifying selection, though trans variation appears to be more sensitive to selective constraint. Comparing interspecific expression divergence between S. cerevisiae and S. paradoxus to our intraspecific variation suggests a significant departure from a neutral model of molecular evolution. A further examination of correlation between polymorphism and divergence within each category suggests that cis divergence is more frequently mediated by positive Darwinian selection than is trans divergence.
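The logic of separating cis from trans effects with allele-specific read counts can be illustrated with a simplified sketch. This is not the authors' joint model: it assumes parental counts have already been normalized, tests allelic imbalance in the hybrid with an exact binomial test, and uses hypothetical function names. It is only practical for modest read counts (up to roughly a thousand reads per gene).

```python
from math import comb, log2

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    observing k successes in n trials.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))

def cis_trans_log_ratios(parent_a, parent_b, hybrid_a, hybrid_b):
    """Split a parental expression difference into cis and trans parts.

    In the hybrid, both alleles share one trans environment, so the allelic
    log ratio estimates the cis effect. The remainder of the parental log
    ratio is attributed to trans. All counts must be positive.
    """
    cis = log2(hybrid_a / hybrid_b)
    total = log2(parent_a / parent_b)
    return cis, total - cis
```

For example, `binom_two_sided_p(hybrid_a, hybrid_a + hybrid_b)` tests the null hypothesis of no cis difference, under which the two alleles are expected to contribute reads equally in the hybrid.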