Searching for Darwinian selection in natural populations has been the focus of a multitude of studies over the last decades. Here we present the 1000 Genomes Selection Browser 1.0 (http://hsb.upf.edu) as a resource for signatures of recent natural selection in modern humans. We have implemented a large number of neutrality tests and summary statistics informative for the action of selection, such as Tajima’s D, CLR, Fay and Wu’s H, Fu and Li’s F* and D*, XPEHH, ΔiHH, iHS, FST, ΔDAF and XPCLR among others, and applied them to low-coverage sequencing data from the 1000 Genomes Project (Phase 1; release April 2012). We have implemented a publicly available genome-wide browser to communicate the results from three different populations of West African, Northern European and East Asian ancestry (YRI, CEU, CHB). Information is provided in UCSC-style format to facilitate integration with the rich UCSC browser tracks, and an access page provides instructions and convenient visualization. We believe that this expandable resource will facilitate the interpretation of signals of selection on different temporal, geographical and genomic scales.
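As an illustration of one of the statistics listed above, the following is a minimal sketch of Tajima's D using the standard published formula (Tajima 1989). It is not the browser's implementation; the function names and the toy input format (equal-length strings of "0"/"1" alleles, one per sampled chromosome) are assumptions made for this example.

```python
from math import sqrt


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences (pi), summed over sites.

    `haplotypes` is a hypothetical toy input: a list of equal-length
    strings of '0' (ancestral) and '1' (derived) alleles.
    """
    n = len(haplotypes)
    pi = 0.0
    for site in zip(*haplotypes):
        k = site.count("1")  # number of derived alleles at this site
        pi += 2.0 * k * (n - k) / (n * (n - 1))
    return pi


def segregating_sites(haplotypes):
    """Count sites where both alleles are present in the sample."""
    n = len(haplotypes)
    return sum(1 for site in zip(*haplotypes) if 0 < site.count("1") < n)


def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s and diversity pi.

    Returns None when the statistic is undefined (no segregating sites,
    or n < 4, where the variance term degenerates to zero).
    """
    if s == 0 or n < 4:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    sample = ["000", "100", "010", "001"]  # toy data, singletons only
    d = tajimas_d(len(sample), segregating_sites(sample),
                  nucleotide_diversity(sample))
    print(d)
```

An excess of rare variants, as in the toy sample, gives a negative D; in practice the statistic is computed in windows along the genome and interpreted against a neutral expectation.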
Background: The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, like the LHC-like and the photosystem II subunit S (PSBS). The evolution of this complex superfamily has long remained elusive, partially due to previously missing families.
Results: In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both hidden Markov model analyses and structural predictions, numerous new LHC-like sequences were identified and several new families were described, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. The test of alternative topologies of sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly supports the independent origins of LHC and PSBS families via two unrelated internal gene duplication events. This result was confirmed by the application of cluster likelihood mapping.
Conclusions: The independent evolution of LHC and PSBS families is supported by strong phylogenetic evidence. In addition, a possible origin of LHC and PSBS families from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins, seems likely. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here is in agreement with the character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages.
Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and in diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily.
The genome-wide results for three human populations from The 1000 Genomes Project and an R-package implementing the 'Hierarchical Boosting' framework are available at http://hsb.upf.edu/.
The ~500 species of the cichlid fish species flock of Lake Victoria, East Africa, have evolved in a record-setting 100 000 years and represent one of the largest adaptive radiations. We examined the population structure of the endangered cichlid species Xystichromis phytophagus from Lake Kanyaboli, a satellite lake to Lake Victoria in the Kenyan Yala wetlands. Two sets of molecular markers were analysed (sequences of the mitochondrial control region and six microsatellite loci) and revealed surprisingly high levels of genetic variability in this species. Mitochondrial DNA sequences failed to detect population structuring among the three sample populations. A model-based population assignment test based on microsatellite data revealed that the three populations most probably aggregate into a larger panmictic population. However, values of population pairwise FST indicated moderate levels of genetic differentiation for one population. Eleven distinct mitochondrial haplotypes were found among 205 specimens of X. phytophagus, a relatively high number compared to the total number of 54 haplotypes that were recovered from hundreds of specimens of the entire cichlid species flock of Lake Victoria. Most of the X. phytophagus mitochondrial DNA haplotypes were absent from the main Lake Victoria, corroborating the putative importance of satellite lakes as refugia for haplochromine cichlids that went extinct from the main lake in the last decades and possibly during the Late Pleistocene desiccation of Lake Victoria.
Background: The only known albino gorilla, named Snowflake, was a male wild-born individual from Equatorial Guinea who lived at the Barcelona Zoo for almost 40 years. He was diagnosed with non-syndromic oculocutaneous albinism, i.e. white hair, light eyes, pink skin, photophobia and reduced visual acuity. Despite previous efforts to explain the genetic cause, it is still unknown. Here, we study the genetic cause of his albinism and, making use of whole genome sequencing data, find a higher inbreeding coefficient compared to other gorillas.
Results: We successfully identified the causal genetic variant for Snowflake’s albinism, a non-synonymous single nucleotide variant located in a transmembrane region of SLC45A2. This transporter is known to be involved in oculocutaneous albinism type 4 (OCA4) in humans. We provide experimental evidence that this amino acid replacement alters the membrane-spanning capability of this transmembrane region. Finally, we provide a comprehensive study of genome-wide patterns of autozygosity revealing that Snowflake’s parents were related, making this the first report of inbreeding in a wild-born Western lowland gorilla.
Conclusions: In this study we demonstrate how the use of whole genome sequencing can be extended to link genotype and phenotype in non-model organisms, and that it can be a powerful tool in conservation genetics (e.g., inbreeding and genetic diversity) with the expected decrease in sequencing cost.
The prevalence of non-insulin-dependent diabetes mellitus (type II diabetes) in Polynesia is among the highest recorded worldwide and is substantially higher than in neighboring human populations. Such large differences in the frequency of a phenotype between populations may be explained by large allele frequency differences between populations in genes associated with the phenotype. To identify genes that may explain the high between-population variation in type II diabetes prevalence in the Pacific, we determined the frequency of 10 type II diabetes-associated alleles in 23 Polynesians, 23 highland New Guineans and 19 Han Chinese, calculated population-pairwise Fst values for each allele and compared these values to the distribution of Fst values from ~100 000 SNPs from the same populations. The susceptibility allele in the PPARGC1A gene is at a frequency of 0.717 in Polynesians, 0.368 in Chinese but is absent in the New Guineans. The striking frequency difference between Polynesians and New Guineans is highly unusual (Fst = 0.703, P = 0.007) and we therefore suggest that this allele may play a role in the large difference in type II diabetes prevalence between Polynesians and neighboring populations.
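The comparison described above (a per-allele pairwise Fst, ranked against a genome-wide background) can be sketched as follows. This is only an illustration: the abstract does not state which Fst estimator was used, so the simple heterozygosity-based definition below, with equal population weights and no sample-size correction, is an assumption and need not reproduce the reported value. The function names and the background list are hypothetical.

```python
def pairwise_fst(p1, p2):
    """Fst for one biallelic site from two population allele frequencies.

    Uses Fst = (HT - HS) / HT with equal weights for both populations
    (an assumed, simplified estimator; no sample-size correction).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # site is monomorphic in both populations
    return (h_total - h_within) / h_total


def empirical_p_value(observed, background):
    """Fraction of background Fst values at least as large as `observed`."""
    if not background:
        raise ValueError("background distribution is empty")
    exceed = sum(1 for value in background if value >= observed)
    return exceed / len(background)


if __name__ == "__main__":
    # Allele frequencies quoted in the abstract for the PPARGC1A allele
    # (Polynesians vs. New Guineans).
    fst = pairwise_fst(0.717, 0.0)
    # Hypothetical stand-in for the genome-wide SNP Fst distribution.
    background = [0.01, 0.05, 0.10, 0.30, 0.80]
    print(fst, empirical_p_value(fst, background))
```

An allele whose Fst falls in the extreme upper tail of the background distribution shows more differentiation than is typical for the same pair of populations, which is the logic behind the reported P value.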
Essential trace elements possess vital functions at molecular, cellular, and physiological levels in health and disease, and they are tightly regulated in the human body. In order to assess variability and potential adaptive evolution of trace element homeostasis, we quantified 18 trace elements in 150 liver samples, together with the expression levels of 90 genes and abundances of 40 proteins involved in their homeostasis. Additionally, we genotyped 169 single nucleotide polymorphisms (SNPs) in the same sample set. We detected significant associations for 8 protein quantitative trait loci (pQTLs), 10 expression quantitative trait loci (eQTLs), and 15 micronutrient quantitative trait loci (nutriQTLs). Six of these exceeded the false discovery rate cutoff and were related to essential trace elements: 1) one pQTL for GPX2 (rs10133290); 2) two previously described eQTLs for HFE (rs12346) and SELO (rs4838862) expression; and 3) three nutriQTLs: the pathogenic C282Y mutation at HFE affecting iron (rs1800562), and two SNPs within several clustered metallothionein genes determining selenium concentration (rs1811322 and rs904773). Within the complete set of significant QTLs (which involved 30 SNPs and 20 gene regions), we identified 12 SNPs with extreme patterns of population differentiation (FST values in the top 5% in at least one HapMap population pair) and significant evidence for selective sweeps involving QTLs at GPX1, SELENBP1, GPX3, SLC30A9, and SLC39A8. Overall, this detailed study of various molecular phenotypes illustrates the role of regulatory variants in explaining differences in trace element homeostasis among populations and in the human adaptive response to environmental pressures related to micronutrients.