Haojing Shao scite author profile

A global reference for human genetic variation

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences

Poznik et al. 2016

We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including SNVs, MNVs, indels, STRs, and CNVs. Of these, CNVs contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree based on binary SNVs and projected the more complex variants onto it, estimating the numbers of mutations for each class. Our phylogeny reveals bursts of extreme expansions in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.

Complete Resequencing of 40 Genomes Reveals Domestication Events and Genes in Silkworm (Bombyx)

Xia, Guo, Zhang et al. 2009. Science

A single–base pair resolution silkworm genetic variation map was constructed from 40 domesticated and wild silkworms, each sequenced to approximately threefold coverage, representing 99.88% of the genome. We identified ∼16 million single-nucleotide polymorphisms, many indels, and structural variations. We find that the domesticated silkworms are clearly genetically differentiated from the wild ones, but they have maintained large levels of genetic variability, suggesting a short domestication event involving a large number of individuals. We also identified signals of selection at 354 candidate genes that may have been important during domestication, some of which have enriched expression in the silk gland, midgut, and testis. These data add to our understanding of the domestication processes and may have applications in devising pest control strategies and advancing the use of silkworms as efficient bioreactors.

Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease

Zhou, Cao, Zuo et al. 2016. Nat Genet

The human major histocompatibility complex (MHC) region has been shown to be associated with numerous diseases. However, it remains a challenge to pinpoint the causal variants for these associations because of the extreme complexity of the region. We thus sequenced the entire 5-Mb MHC region in 20,635 individuals of Han Chinese ancestry (10,689 controls and 9,946 patients with psoriasis) and constructed a Han-MHC database that includes both variants and HLA gene typing results of high accuracy. We further identified multiple independent new susceptibility loci in HLA-C, HLA-B, HLA-DPB1 and BTNL2 and an intergenic variant, rs118179173, associated with psoriasis and confirmed the well-established risk allele HLA-C*06:02. We anticipate that our Han-MHC reference panel built by deep sequencing of a large number of samples will serve as a useful tool for investigating the role of the MHC region in a variety of diseases and thus advance understanding of the pathogenesis of these disorders.

A large-scale screen for coding variants predisposing to psoriasis

et al. 2013

To explore the contribution of functional coding variants to psoriasis, we analyzed nonsynonymous single-nucleotide variants (SNVs) across the genome by exome sequencing in 781 psoriasis cases and 676 controls and through follow-up validation in 1,326 candidate genes by targeted sequencing in 9,946 psoriasis cases and 9,906 controls from the Chinese population. We discovered two independent missense SNVs in IL23R and GJB2 of low frequency and five common missense SNVs in LCE3D, ERAP1, CARD14 and ZNF816A associated with psoriasis at genome-wide significance. Rare missense SNVs in FUT2 and TARBP1 were also observed with suggestive evidence of association. Single-variant and gene-based association analyses of nonsynonymous SNVs did not identify newly associated genes for psoriasis in the regions subjected to targeted resequencing. This suggests that coding variants in the 1,326 targeted genes contribute only a limited fraction of the overall genetic risk for psoriasis.

Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

et al. 2011

Genetic diversity, molecular phylogeny and selection evidence of the silkworm mitochondria implicated by complete resequencing of 41 genomes

Guo, Shao et al. 2010. BMC Evol Biol

Background: Mitochondria are a valuable resource for studying the evolutionary process and deducing phylogeny. A few mitochondria genomes have been sequenced, but a comprehensive picture of the domestication event for silkworm mitochondria remains to be established. In this study, we integrate the extant data, and perform a whole genome resequencing of Japanese wild silkworm to obtain breakthrough results in silkworm mitochondrial (mt) population, and finally use these to deduce a more comprehensive phylogeny of the Bombycidae. Results: We identified 347 single nucleotide polymorphisms (SNPs) in the mt genome, but found no past recombination event to have occurred in the silkworm progenitor. A phylogeny inferred from these whole genome SNPs resulted in a well-classified tree, confirming that the domesticated silkworm, Bombyx mori, most recently diverged from the Chinese wild silkworm, rather than from the Japanese wild silkworm. We showed that the population sizes of the domesticated and Chinese wild silkworms both experience neither expansion nor contraction. We also discovered that one mt gene, named cytochrome b, shows a strong signal of positive selection in the domesticated clade. This gene is related to energy metabolism, and may have played an important role during silkworm domestication. Conclusions: We present a comparative analysis on 41 mt genomes of B. mori and B. mandarina from China and Japan. With these, we obtain a much clearer picture of the evolution history of the silkworm. The data and analyses presented here aid our understanding of the silkworm in general, and provide a crucial insight into silkworm phylogeny.

npInv: accurate detection and genotyping of inversions using long read sub-alignment

et al. 2018

Background: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. Result: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversion using long read sub-alignment of long read sequencing data. We benchmark npInv with other tools in both simulation and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR mediated inversions in NA12878 with flanking IR less than 7kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. Conclusion: The application of npInv shows high accuracy in both simulation and real data. The results give deeper insight into understanding inversion. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2252-9) contains supplementary material, which is available to authorized users.
