Most of the current knowledge of the genetic basis of adaptive evolution is based on the analysis of single nucleotide polymorphisms (SNPs). Despite increasing evidence for their causal role, the contribution of structural variants to adaptive evolution remains largely unexplored. In this work, we analyzed the population frequencies of 1,615 transposable element (TE) insertions annotated in the reference genome of Drosophila melanogaster in 91 samples from 60 worldwide natural populations. We identified a set of 300 polymorphic TEs that are present at high population frequencies and located in genomic regions with high recombination rate, where the efficiency of natural selection is high. The age and the length of these 300 TEs are consistent with relatively young and long insertions reaching high frequencies due to the action of positive selection. In addition, we identified a set of 21 fixed TEs also likely to be adaptive. Indeed, we, and others, found evidence of selection for 84 of these reference TE insertions. The analysis of the genes located near these 84 candidate adaptive insertions suggested that the functional response to selection is related to the GO categories of response to stimulus, behavior, and development. We further showed that a subset of the candidate adaptive TEs affects expression of nearby genes, and five of them have already been linked to an ecologically relevant phenotypic effect. Our results provide a more complete understanding of the genetic variation and the fitness-related traits relevant for adaptive evolution. Similar studies should help uncover the importance of TE-induced adaptive mutations in other species as well.
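The selection of candidate insertions described in the abstract above (polymorphic, at high population frequency, in regions with high recombination) can be illustrated with a minimal sketch. The class name, field names, and all thresholds below are hypothetical and chosen only for illustration; the abstract does not state the cut-offs the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TEInsertion:
    name: str
    population_frequency: float  # fraction of sampled chromosomes carrying the insertion
    recombination_rate: float    # local recombination rate, e.g. in cM/Mb


def candidate_adaptive_tes(insertions, min_freq=0.10, max_freq=0.95, min_recomb=0.0):
    """Keep polymorphic TEs at high frequency located in recombining regions.

    All three thresholds are illustrative assumptions, not values from the paper.
    """
    return [
        te for te in insertions
        if min_freq <= te.population_frequency <= max_freq  # frequent but not fixed
        and te.recombination_rate > min_recomb              # region recombines
    ]


# Toy data, invented for this example.
example = [
    TEInsertion("TE_a", 0.45, 2.1),  # frequent, polymorphic, recombining: kept
    TEInsertion("TE_b", 0.02, 3.0),  # too rare
    TEInsertion("TE_c", 1.00, 1.5),  # fixed, so not polymorphic
    TEInsertion("TE_d", 0.60, 0.0),  # no recombination
]
selected = [te.name for te in candidate_adaptive_tes(example)]
print(selected)
```

Fixed insertions, such as the 21 mentioned in the abstract, would need a separate criterion, because this filter deliberately excludes them.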
Genetic variation is the fuel of evolution. However, analyzing evolutionary dynamics in natural populations is challenging: sequencing of entire populations remains costly and comprehensive sampling logistically difficult.
Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome datasets from natural populations of this species have been published in recent years. A major challenge is the integration of disparate datasets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This dataset, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental meta-data. A web-based genome browser and web portal provide easy access to the SNP dataset. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan dataset. Our resource will enable population geneticists to analyze spatio-temporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail.
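To make the idea of estimating allele frequencies from Pool-Seq reads concrete, here is a minimal count-based sketch. It is not the PoolSNP or SNAPE-pooled algorithm; the function name and both thresholds are hypothetical, and the rule of treating very low alternate counts as sequencing error is an assumption made only for illustration.

```python
def pooled_allele_frequency(ref_count, alt_count, min_coverage=10, min_alt_count=2):
    """Estimate the alternate-allele frequency at one site from pooled read counts.

    Returns None when coverage is too low to call the site, 0.0 when the
    alternate allele is supported by too few reads (treated here as likely
    sequencing error), and alt / (ref + alt) otherwise.
    Thresholds are illustrative assumptions, not values from any published tool.
    """
    coverage = ref_count + alt_count
    if coverage < min_coverage:
        return None          # site not callable in this pool
    if alt_count < min_alt_count:
        return 0.0           # too little support for the alternate allele
    return alt_count / coverage


# Toy read counts, invented for this example.
print(pooled_allele_frequency(30, 10))  # well covered site
print(pooled_allele_frequency(3, 2))    # coverage below threshold
print(pooled_allele_frequency(49, 1))   # single alternate read
```

A probabilistic caller would instead model sequencing error and pool size explicitly; the hard thresholds here only show why low-coverage sites and singleton reads need special handling.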
Mapping genotype to phenotype is challenging because of the difficulties in identifying both the traits under selection and the specific genetic variants underlying these traits. Most of the current knowledge of the genetic basis of adaptive evolution is based on the analysis of single nucleotide polymorphisms (SNPs). Despite increasing evidence for their causal role, the contribution of structural variants to adaptive evolution remains largely unexplored. In this work, we analyzed the population frequencies of 1,615 Transposable Element (TE) insertions in 91 samples from 60 worldwide natural populations of Drosophila melanogaster. We identified a set of 300 TEs that are present at high population frequencies, and located in genomic regions with high recombination rate, where the efficiency of natural selection is high. The age and the length of these 300 TEs are consistent with relatively young and long insertions reaching high frequencies due to the action of positive selection. Indeed, we, and others, found evidence of selective sweeps and/or population differentiation for 65 of them. The analysis of the genes located near these 65 candidate adaptive insertions suggested that the functional response to selection is related to the GO categories of response to stimulus, behavior, and development. We further showed that a subset of the candidate adaptive TEs affects expression of nearby genes, and five of them have already been linked to an ecologically relevant phenotypic effect. Our results provide a more complete understanding of the genetic variation and the fitness-related traits relevant for adaptive evolution. Similar studies should help uncover the importance of TE-induced adaptive mutations in other species as well.
In this work, we screened 303 individual genomes, and 83 pooled samples (containing from 30 to 440 chromosomes each) from 60 worldwide natural D. melanogaster populations to identify the TE insertions most likely involved in adaptive evolution (Fig 1). In addition to the age and the size of the 1,615 TEs analyzed, we calculated four different statistics to detect potentially adaptive TEs. The GO enrichment analysis of the genes located near our set of candidate adaptive insertions pinpoints response to stimulus, behavior, and development as the traits most likely to be shaped by TE-induced mutations. Consistent with these results, genes located near our set of candidate adaptive TEs are significantly enriched for previously identified loci underlying stress- and behavior-related traits. Overall, our results suggest a widespread contribution of TEs to adaptive evolution in D. melanogaster and pinpoint relevant traits for adaptation.
Results
Natural populations of D. melanogaster contain hundreds of polymorphic TEs at high population frequencies
To identify TEs likely to be involved in adaptation, we looked for TEs present at high population frequencies, and located i...
While several studies in a diverse set of species have shed light on the genes underlying adaptation, our knowledge of the selective pressures that explain the observed patterns lags behind. Drosophila melanogaster is a valuable organism to study environmental adaptation because this species originated in Southern Africa and has recently expanded worldwide, and also because it has a functionally well-annotated genome. In this study, we aimed to decipher which environmental variables are relevant for adaptation of D. melanogaster natural populations in Europe and North America. We analysed 36 whole-genome pool-seq samples of D. melanogaster natural populations collected in 20 European and 11 North American locations. We used the BayPass software to identify single nucleotide polymorphisms (SNPs) and transposable elements (TEs) showing signatures of adaptive differentiation across populations, as well as significant associations with 59 environmental variables related to temperature, rainfall, evaporation, solar radiation, wind, daylight hours, and soil type. We found that, in addition to temperature and rainfall, wind-related variables are also relevant for D. melanogaster environmental adaptation. Interestingly, 23%–51% of the genes that showed significant associations with environmental variables were not found to be overly differentiated across populations. In addition to SNPs, we also identified 10 reference transposable element insertions associated with environmental variables. Our results showed that genome-environment association analysis can identify adaptive genetic variants that are undetected by population differentiation analysis while also allowing the identification of candidate environmental drivers of adaptation.
Motivation: Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete identification of the genetic differences among individuals, populations and species. Results: In this work, we present a new version of T-lex, a computational pipeline that accurately genotypes and estimates the population frequencies of reference TE insertions using short-read high-throughput sequencing data. In this new version, we have re-designed the T-lex algorithm to integrate the BWA-MEM short-read aligner, which is one of the most accurate short-read mappers and can be launched on longer short-reads (e.g. reads >150 bp). We have added new filtering steps to increase the accuracy of the genotyping, and new parameters that allow the user to control both the minimum and maximum number of reads, and the minimum number of strains to genotype a TE insertion. We also showed for the first time that T-lex3 provides accurate TE calls in a plant genome. Availability and implementation: To test the accuracy of T-lex3, we called 1630 individual TE insertions in Drosophila melanogaster, 1600 individual TE insertions in humans, and 3067 individual TE insertions in the rice genome. We showed that this new version of T-lex is a broadly applicable and accurate tool for genotyping and estimating TE frequencies in organisms with different genome sizes and different TE contents. T-lex3 is available at GitHub: https://github.com/GonzalezLab/T-lex3. Supplementary information: Supplementary data are available at Bioinformatics online.
Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome datasets from natural populations of this species have been published over the last 20 years. A major challenge is the integration of these disparate datasets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution and population structure of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 population samples from over 100 locations in >20 countries on four continents based on a combination of 121 unpublished and 150 previously published genomic datasets. Several of these locations have been sampled at different seasons across multiple years. This dataset, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental meta-data. A web-based genome browser and web portal provide easy access to the SNP dataset. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan dataset. Our resource will enable population geneticists to analyze spatio-temporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail.