Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm a significant association of the rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals the wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex but, most importantly, reflects the strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22–28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India.
The Slavic branch of the Balto-Slavic sub-family of Indo-European languages underwent rapid divergence as a result of the spatial expansion of its speakers from Central-East Europe, in early medieval times. This expansion–mainly to East Europe and the northern Balkans–resulted in the incorporation of genetic components from numerous autochthonous populations into the Slavic gene pools. Here, we characterize genetic variation in all extant ethnic groups speaking Balto-Slavic languages by analyzing mitochondrial DNA (n = 6,876), Y-chromosomes (n = 6,079) and genome-wide SNP profiles (n = 296), within the context of other European populations. We also reassess the phylogeny of Slavic languages within the Balto-Slavic branch of Indo-European. We find that genetic distances among Balto-Slavic populations, based on autosomal and Y-chromosomal loci, show a high correlation (0.9) both with each other and with geography, but a slightly lower correlation (0.7) with mitochondrial DNA and linguistic affiliation. The data suggest that genetic diversity of the present-day Slavs was predominantly shaped in situ, and we detect two different substrata: ‘central-east European’ for West and East Slavs, and ‘south-east European’ for South Slavs. A pattern of distribution of segments identical by descent between groups of East-West and South Slavs suggests shared ancestry or a modest gene flow between those two groups, which might derive from the historic spread of Slavic people.
Background: Plasmids play an important role in the dissemination of antibiotic resistance, making their detection an important task. Using whole genome sequencing (WGS), it is possible to capture both bacterial and plasmid sequence data, but short read lengths make plasmid detection a complex problem. Results: We developed a tool named PlasmidSeeker that enables the detection of plasmids from bacterial WGS data without read assembly. The PlasmidSeeker algorithm is based on k-mers and uses k-mer abundance to distinguish between plasmid and bacterial sequences. We tested the performance of PlasmidSeeker on a set of simulated and real bacterial WGS samples, resulting in 100% sensitivity and 99.98% specificity. Conclusion: PlasmidSeeker enables quick detection of known plasmids and complements existing tools that assemble plasmids de novo. The PlasmidSeeker source code is stored on GitHub.
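The idea of using k-mer abundance to tell plasmid from chromosomal sequence can be illustrated with a minimal sketch. This is not PlasmidSeeker's implementation: the function names, the use of a median, and the toy sequences are all assumptions made for illustration, and reverse complements are ignored. The sketch counts k-mers in the reads, then reports what fraction of a known plasmid's k-mers were found and how their coverage compares with that of chromosomal k-mers.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence (forward strand only)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def count_read_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def plasmid_signal(read_counts, plasmid_seq, chromosome_seq, k):
    """Hypothetical helper: fraction of plasmid k-mers seen in the reads and
    the ratio of plasmid to chromosomal median k-mer coverage.
    Assumes both reference sequences are at least k bases long."""
    plasmid_kmers = set(kmers(plasmid_seq, k))
    chrom_kmers = set(kmers(chromosome_seq, k))
    found = [read_counts[km] for km in plasmid_kmers if read_counts[km] > 0]
    fraction_found = len(found) / len(plasmid_kmers)
    chrom_cov = median(read_counts[km] for km in chrom_kmers)
    plasmid_cov = median(found) if found else 0
    ratio = plasmid_cov / chrom_cov if chrom_cov else float("nan")
    return fraction_found, ratio


# Toy data: the plasmid is sequenced three times as deeply as the chromosome.
chromosome = "AAAACCCCGGGG"
plasmid = "TTTTGATC"
reads = [chromosome] * 2 + [plasmid] * 6
read_counts = count_read_kmers(reads, 4)
fraction_found, coverage_ratio = plasmid_signal(read_counts, plasmid, chromosome, 4)
```

A high fraction of recovered plasmid k-mers suggests the plasmid is present, and a coverage ratio well above 1 would be consistent with a multi-copy plasmid under these assumptions.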
Several recent studies detected fine-scale genetic structure in human populations. Hence, groups conventionally treated as single populations harbour significant variation in terms of allele frequencies and patterns of haplotype sharing. It has been shown that these findings should be considered when performing studies of genetic associations and natural selection, especially when dealing with polygenic phenotypes. However, there is little understanding of the practical effects of such genetic structure on demography reconstructions and selection scans when focusing on recent population history. Here we tested the impact of population structure on such inferences using high-coverage (~30×) genome sequences of 2305 Estonians. We show that different regions of Estonia differ in both effective population size dynamics and signatures of natural selection. By analyzing identity-by-descent segments we also reveal that some Estonian regions exhibit evidence of a bottleneck 10–15 generations ago reflecting sequential episodes of wars, plague and famine, although this signal is virtually undetected when treating Estonia as a single population. Besides that, we provide a framework for relating effective population size estimated from genetic data to actual census size and validate it on the Estonian population. This approach may be widely used both to cross-check estimates based on historical sources as well as to get insight into times and/or regions with no other information available. Our results suggest that the history of human populations within the last few millennia can be highly region specific and cannot be properly studied without taking local genetic structure into account.
We have developed a computational method that counts the frequencies of unique k-mers in FASTQ-formatted genome data and uses this information to infer the genotypes of known variants. FastGT can detect the variants in a 30x genome in less than 1 hour using ordinary low-cost server hardware. The overall concordance with the genotypes of two Illumina “Platinum” genomes is 99.96%, and the concordance with the genotypes of the Illumina HumanOmniExpress is 99.82%. Our method provides a k-mer database that can be used for the simultaneous genotyping of approximately 30 million single nucleotide variants (SNVs), including >23,000 SNVs from the Y chromosome. The source code of the FastGT software is available at GitHub (https://github.com/bioinfo-ut/GenomeTester4/).
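The principle of genotyping from k-mer counts can be sketched in a few lines. The example below is a simplified illustration, not FastGT's method: it replaces any statistical genotype model with a fixed count threshold, ignores reverse complements, and uses hypothetical names (`count_kmers`, `call_genotype`, `min_count`) and toy reads. It assumes each known variant has one k-mer unique to the reference allele and one unique to the alternative allele.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count all overlapping k-mers in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def call_genotype(counts, ref_kmer, alt_kmer, min_count=2):
    """Naive threshold-based call from allele-specific k-mer counts.
    min_count is an assumed cutoff meant to suppress sequencing errors."""
    ref_seen = counts[ref_kmer] >= min_count
    alt_seen = counts[alt_kmer] >= min_count
    if ref_seen and alt_seen:
        return "ref/alt"
    if ref_seen:
        return "ref/ref"
    if alt_seen:
        return "alt/alt"
    return "no-call"


# Toy reads covering a variant site: A (reference) versus T (alternative).
het_reads = ["ACGTAC", "ACGTAC", "ACGTTC", "ACGTTC"]
hom_reads = ["ACGTAC", "ACGTAC"]
het_call = call_genotype(count_kmers(het_reads, 5), "CGTAC", "CGTTC")
hom_call = call_genotype(count_kmers(hom_reads, 5), "CGTAC", "CGTTC")
```

Because only a lookup of pre-selected k-mers is needed per variant, this style of genotyping avoids read alignment altogether, which is what makes the approach fast.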
Background: Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees. Results: A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1–2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade, whereas other tools assign each read to a reference genome. Using a dataset of 100 Escherichia coli isolates, we demonstrate that StrainSeeker can predict the clades of E. coli with 92% accuracy and correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain. Conclusion: StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker’s web interface and pre-computed guide trees are available online, and the source code is stored on GitHub.
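Comparing observed and expected fractions of node-specific k-mers can be sketched as follows. This is a hedged illustration rather than StrainSeeker's algorithm: the Poisson model for the expected fraction, the `min_ratio` cutoff, and all function names are assumptions introduced here. Under that model, a k-mer with mean coverage c is seen at least once with probability 1 − e^(−c), so a node is treated as present when the observed fraction of its specific k-mers approaches that expectation.

```python
import math


def expected_fraction(kmer_coverage):
    """Assumed Poisson model: probability that a k-mer with the given mean
    coverage appears at least once in the sample."""
    return 1.0 - math.exp(-kmer_coverage)


def node_present(sample_kmers, node_kmers, kmer_coverage, min_ratio=0.8):
    """Hypothetical presence test for one guide-tree node.
    Assumes node_kmers is non-empty and kmer_coverage is positive."""
    observed = len(node_kmers & sample_kmers) / len(node_kmers)
    expected = expected_fraction(kmer_coverage)
    return observed / expected >= min_ratio


# Toy example: the sample contains every k-mer specific to node A, none of node B's.
sample = {"AAAA", "AAAC", "CCCC"}
node_a = {"AAAA", "AAAC"}
node_b = {"GGGG", "GGGT"}
a_present = node_present(sample, node_a, kmer_coverage=5.0)
b_present = node_present(sample, node_b, kmer_coverage=5.0)
```

Applying such a test from the root towards the leaves would give the deepest node that still appears present, which is one way to read "where the isolate branches off the guide tree".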
Introduction: Recurrent miscarriage (RM; ≥3 consecutive pregnancy losses) occurs in 1–3% of fertile couples. No biomarkers with high predictive value for threatening miscarriage have been identified. We aimed to profile whole-genome differential gene expression in RM placental tissue, and to determine the protein levels of identified loci in maternal sera in early pregnancy. Methods: GeneChips (Affymetrix®) were used for discovery and Taqman RT-qPCR assays for replication of mRNA expression in placentas from RM cases (n = 13) compared to uncomplicated pregnancies matched for gestational age (n = 23). Concentrations of soluble TRAIL (sTRAIL) and calprotectin in maternal serum in normal first trimester (n = 35) and failed pregnancies (early miscarriage, n = 18; late miscarriage, n = 4; tubal pregnancy, n = 11) were determined using ELISA. Results: In RM placentas, 30 differentially expressed transcripts (nominal P-value < 0.05) were identified. Significantly increased placental mRNA expression of TNF-related apoptosis-inducing ligand (TRAIL; P = 1.4 × 10⁻³; fold-change 1.68) and of S100A8 (P = 7.9 × 10⁻⁴; fold-change 2.56), which encodes the inflammatory marker calprotectin (S100A8/A9), was confirmed by RT-qPCR. When compared to normal first trimester pregnancy (sTRAIL 16.1 ± 1.6 pg/ml), a significantly higher maternal serum concentration of sTRAIL was detected at the RM event (33.6 ± 4.3 pg/ml, P = 0.00027) and in pregnant women who developed an unpredicted miscarriage 2–50 days after prospective serum sampling (28.5 ± 4.4 pg/ml, P = 0.039). Women with tubal pregnancy also exhibited elevated sTRAIL (30.5 ± 3.9 pg/ml, P = 0.035). Maternal serum levels of calprotectin were neither diagnostic nor prognostic for early pregnancy failures (P > 0.05). Conclusions: The study identified sTRAIL as a potential predictive biomarker in maternal serum for early pregnancy complications.
Our understanding of the genetics of skin pigmentation has been largely skewed towards populations of European ancestry, with less attention paid to South Asian populations, which show considerable pigmentation diversity. Here, we investigate skin pigmentation variation in a cohort of 1,167 individuals in the Middle Gangetic Plain of the Indian subcontinent. Our data confirm the association of rs1426654 with skin pigmentation among South Asians, consistent with previous studies, and also show an association for the single nucleotide polymorphism rs2470102. Our haplotype analyses further help us delineate the haplotype distribution across social categories and skin color. Taken together, our findings suggest that the social structure defined by the caste system in India has a profound influence on the skin pigmentation patterns of the subcontinent. In particular, social category and associated single nucleotide polymorphisms explain about 32% and 6.4%, respectively, of the total phenotypic variance. Phylogeography of the associated single nucleotide polymorphisms studied across 52 diverse populations of the Indian subcontinent shows the wide presence of the derived alleles, although their frequencies vary across populations. Our results show that both polymorphisms (rs1426654 and rs2470102) play an important role in the skin pigmentation diversity of South Asians.