Fereydoun Hormozdiari scite author profile

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

An integrated map of structural variation in 2,504 human genomes

Sudmant

Rausch

Gardner

et al. 2015

Nature

2,111

124

2,439

Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association.

Great ape genetic diversity and population history

Prado-Martinez

Sudmant

Kidd

et al. 2013

Nature

809

1,210

Most great ape genetic variation remains uncharacterized; however, its study is critical for understanding population history, recombination, selection and susceptibility to disease. Here we sequence to high coverage a total of 79 wild- and captive-born individuals representing all six great ape species and seven subspecies and report 88.8 million single nucleotide polymorphisms. Our analysis provides support for genetically distinct populations within each species, signals of gene flow, and the split of common chimpanzees into two distinct groups: Nigeria–Cameroon/western and central/eastern populations. We find extensive inbreeding in almost all wild populations, with eastern gorillas being the most extreme. Inferred effective population sizes have varied radically over time in different lineages and this appears to have a profound effect on the genetic diversity at, or close to, genes in almost all species. We discover and assign 1,982 loss-of-function variants throughout the human and great ape lineages, determining that the rate of gene loss has not been different in the human branch compared to other internal branches in the great ape phylogeny. This comprehensive catalogue of great ape genome diversity provides a framework for understanding evolution and a resource for more effective management of wild and captive great ape populations.

Mapping copy number variation by population-scale genome sequencing

Mills

Walter

Stewart

et al. 2011

Nature

1,023

1,228

Genomic structural variants (SVs) are abundant in humans, differing from other variation classes in extent, origin, and functional impact. Despite progress in SV characterization, the nucleotide-resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (i.e., copy number variants) based on whole-genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analyzing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high-frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.

Resolving the complexity of the human genome using single-molecule sequencing

Chaisson

Huddleston

Dennis

et al. 2014

Nature

742

795

The human genome is arguably the most complete mammalian reference assembly [1–3], yet more than 160 euchromatic gaps remain [4–6] and aspects of its structural variation remain poorly understood ten years after its completion [7–9]. In order to identify missing sequence and genetic variation, we sequenced and analyzed a haploid human genome (CHM1) using single-molecule, real-time (SMRT) DNA sequencing [10]. We closed or extended 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats (STRs), often multiple kilobases in length, embedded within GC-rich genomic regions. We resolved the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions, and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kbp in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long STRs. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.
