Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180,184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr, including the ability to submit fuzzy sets and upload BED files, an improved application programming interface, and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.
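To make the idea of gene set enrichment concrete, the sketch below scores the overlap between a query gene list and each set in a library with a one-sided hypergeometric test, then ranks the sets by p-value. This is a generic over-representation test and is not claimed to be the statistic Enrichr itself reports. The function names, the toy library, and the gene symbols are hypothetical and used only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(overlap, query_size, set_size, background_size):
    """One-sided p-value P(X >= overlap) for a hypergeometric draw.

    query_size genes are drawn without replacement from background_size genes,
    of which set_size belong to the annotated gene set.
    """
    max_overlap = min(query_size, set_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_gene_sets(query, library, background_size):
    """Return (term, p_value, overlapping_genes) tuples sorted by p-value."""
    query = set(query)
    results = []
    for term, genes in library.items():
        genes = set(genes)
        shared = query & genes
        p = hypergeom_enrichment_p(len(shared), len(query), len(genes), background_size)
        results.append((term, p, sorted(shared)))
    return sorted(results, key=lambda row: row[1])


# Hypothetical toy library and query; the background size is an assumption.
toy_library = {
    "term_A": ["G1", "G2", "G3"],
    "term_B": ["G7", "G8", "G9"],
}
ranked = rank_gene_sets(["G1", "G2", "G5"], toy_library, background_size=20)
for term, p_value, shared in ranked:
    print(term, p_value, shared)
```

In practice the p-values across many gene sets would also need a multiple-testing correction, which this sketch leaves out.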
The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, but these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need for a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing.
The spatial structure of the environment (e.g. the configuration of habitat patches) may play an important role in determining the strength of local adaptation. However, previous studies of habitat heterogeneity and local adaptation have largely been limited to simple landscapes, which poorly represent the multiscale habitat structure common in nature. Here, we use simulations to pursue two goals: (i) we explore how landscape heterogeneity, dispersal ability and selection affect the strength of local adaptation, and (ii) we evaluate the performance of several genotype-environment association (GEA) methods for detecting loci involved in local adaptation. We found that the strength of local adaptation increased in spatially aggregated selection regimes, but remained strong in patchy landscapes when selection was moderate to strong. Weak selection resulted in weak local adaptation that was relatively unaffected by landscape heterogeneity. In general, the power of detection methods closely reflected levels of local adaptation. False-positive rates (FPRs), however, showed distinct differences across GEA methods based on levels of population structure. The univariate GEA approach had high FPRs (up to 55%) under limited dispersal scenarios, due to strong isolation by distance. By contrast, multivariate, ordination-based methods had uniformly low FPRs (0-2%), suggesting these approaches can effectively control for population structure. Specifically, constrained ordinations had the best balance of high detection and low FPRs and will be a useful addition to the GEA toolkit. Our results provide both theoretical and practical insights into the conditions that shape local adaptation and how these conditions impact our ability to detect selection.
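The two performance measures discussed above, detection power and false-positive rate, can be computed from a simulation in which the truly adaptive loci are known. The sketch below is a minimal illustration under that assumption. The function name and the locus identifiers are hypothetical, and it does not reproduce the study's simulation or any specific GEA method.

```python
def power_and_fpr(detected, truly_adaptive, all_loci):
    """Return (power, false_positive_rate) for one simulated data set.

    power = fraction of truly adaptive loci that were detected
    FPR   = fraction of neutral loci that were (wrongly) detected
    """
    detected = set(detected)
    truly_adaptive = set(truly_adaptive)
    neutral = set(all_loci) - truly_adaptive

    true_positives = len(detected & truly_adaptive)
    false_positives = len(detected & neutral)

    power = true_positives / len(truly_adaptive) if truly_adaptive else float("nan")
    fpr = false_positives / len(neutral) if neutral else float("nan")
    return power, fpr


# Hypothetical example: 100 loci, the first 10 under selection.
all_loci = range(100)
adaptive = range(10)
flagged = [0, 1, 2, 3, 4, 5, 6, 7, 50, 51]  # 8 true hits, 2 neutral loci flagged
power, fpr = power_and_fpr(flagged, adaptive, all_loci)
print(f"power={power:.2f}, FPR={fpr:.3f}")
```

Averaging these two values over replicate simulations for each landscape and dispersal scenario would give the kind of per-method comparison the abstract summarizes.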
Snowshoe hares maintain seasonal camouflage by molting to a white winter coat, but some hares remain brown during the winter in regions with low snow cover. We show that cis-regulatory variation controlling seasonal expression of the gene underlies this adaptive winter camouflage polymorphism. Genetic variation at clustered by winter coat color across multiple hare and jackrabbit species, revealing a history of recurrent interspecific gene flow. Brown winter coats in snowshoe hares likely originated from an introgressed black-tailed jackrabbit allele that has swept to high frequency in mild winter environments. These discoveries show that introgression of genetic variants that underlie key ecological traits can seed past and ongoing adaptation to rapidly changing environments.
Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.
With the increasing availability of both molecular and topo-climatic data, the main challenges facing landscape genomics (that is, the combination of landscape ecology with population genomics) include processing large numbers of models and distinguishing between selection and demographic processes (e.g. population structure). Several methods address the latter, either by estimating a null model of population history or by simultaneously inferring environmental and demographic effects. Here we present SAMbADA, an approach designed to study signatures of local adaptation, with special emphasis on high performance computing of large-scale genetic and environmental data sets. SAMbADA identifies candidate loci using genotype-environment associations while also incorporating multivariate analyses to assess the effect of many environmental predictor variables. This enables the inclusion of explanatory variables representing population structure into the models to lower the occurrences of spurious genotype-environment associations. In addition, SAMbADA calculates local indicators of spatial association for candidate loci to provide information on whether similar genotypes tend to cluster in space, which constitutes a useful indication of the possible kinship between individuals. To test the usefulness of this approach, we carried out a simulation study and analysed a data set from Ugandan cattle to detect signatures of local adaptation with SAMbADA, BAYENV, LFMM and an FST outlier method (FDIST approach in ARLEQUIN) and compare their results. SAMbADA, an open source software for Windows, Linux and Mac OS X available at http://lasig.epfl.ch/sambada, outperforms other approaches and better suits whole-genome sequence data processing.
Evolutionary adaptation to extreme environments often requires coordinated changes in multiple intersecting physiological pathways, but how such multi-trait adaptation occurs remains unresolved. Transcription factors, which regulate the expression of many genes and can simultaneously alter multiple phenotypes, may be common targets of selection if the benefits of induced changes outweigh the costs of negative pleiotropic effects. We combined complementary population genetic analyses and physiological experiments in North American deer mice (Peromyscus maniculatus) to examine links between genetic variation in transcription factors that coordinate physiological responses to hypoxia (hypoxia-inducible factors, HIFs) and multiple physiological traits that potentially contribute to high-altitude adaptation. First, we sequenced the exomes of 100 mice sampled from different elevations and discovered that several SNPs in the gene Epas1, which encodes the oxygen-sensitive subunit of HIF-2α, exhibited extreme allele frequency differences between highland and lowland populations. Broader geographic sampling confirmed that Epas1 genotype varied predictably with altitude throughout the western US. We then discovered that Epas1 genotype influences heart rate in hypoxia, and the transcriptomic responses to hypoxia (including HIF targets and genes involved in catecholamine signaling) in the heart and adrenal gland. Finally, we used a demographically informed selection scan to show that Epas1 variants have experienced a history of spatially varying selection, suggesting that differences in cardiovascular function and gene regulation contribute to high-altitude adaptation. Our results suggest a mechanism by which Epas1 may aid long-term survival of high-altitude deer mice and provide general insights into the role that highly pleiotropic transcription factors may play in the process of environmental adaptation.
Environmental heterogeneity largely dictates the spatial distributions of parasites and therefore the susceptibility to infection of host populations. We surveyed avian malaria infections in Rufous-collared sparrows (Zonotrichia capensis) across replicated altitudinal and latitudinal transects along the western slope of the Peruvian Andes to assess geographic patterns of prevalence. We found malaria infection prevalence peaked at midelevations along all 3 altitudinal transects (mean ≈ 2,733 m), with highest overall prevalence observed in the northern transect. We observed low levels of malarial parasite diversity, with 94% of infected birds carrying a single Haemoproteus (subgenus Parahaemoproteus) strain. The remaining infected birds harbored either a single alternate Haemoproteus or 1 of 2 Plasmodium strains. Our data suggest that temperature and precipitation are the primary drivers of the spatial patterns in avian malaria prevalence along the western slope of the Andes.