2013
DOI: 10.1371/journal.pone.0059494

Comprehensive Characterization of Human Genome Variation by High Coverage Whole-Genome Sequencing of Forty Four Caucasians

Abstract: Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies…

Cited by 65 publications (59 citation statements)
References 41 publications (70 reference statements)
“…The overlapping component of selection would be most consistent with a similar density of neutral SNPs in the mitochondrial and nuclear genomes (Shen et al. 2013). …”
Section: Results (mentioning)
confidence: 99%
“…Indels are the second most frequent type of polymorphism in the human genome (McCarroll et al, 2006; Durbin et al, 2010; Mills et al, 2006) and range between 1 bp and 10,000 bp in length (Weber et al, 2002, 1000 Genomes Consortium, 2010; Mills et al, 2006; Bhangale et al, 2005), though the vast majority (0.99) of indels are <100 bp (Mills et al, 2011). With the availability of high-throughput sequence data and improved discovery software, it has been estimated that there are 1.4 to 2.8 million indels distributed across all 24 autosomes and each sex chromosome with rates varying between populations and individuals (Montgomery et al, 2013; Shen et al, 2013, Mills et al, 2011; 1000 Genomes Project Consortium, 2010). Moreover, high rates of linkage disequilibrium (r² > 0.80) between many indels and common SNPs available on commercially available arrays (Mills et al, 2011; Eichler, 2006) further suggests their potential importance in biological functioning that may possibly extend to individual differences at the phenotypic level.…”
Section: Introduction (mentioning)
confidence: 99%
“…Genomic marks may cover a large proportion of the genome, and thus many disease-associated variants will be found within these marks by chance. In addition, the heterogeneous distribution of genetic variants and functional regions along the human genome, and thus non-random association with genomic features [7,8], can create spurious correlations that again confound correct interpretation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Genomic marks may cover a large proportion of the genome, and thus many disease-associated variants will be found within these marks by chance. In addition, the heterogeneous distribution of genetic variants and functional regions along the human genome, and thus non-random association with genomic features [7,8], can create spurious correlations that again confound correct interpretation. Functional enrichment methods exploit experimentally derived regulatory genomic regions to assess the relative contribution of variation in each cell type and regulatory annotation to a given phenotype of interest. In their simpler implementation, they estimate enrichment of association p-values based on comparisons of the full set of genetic variants analysed in the GWAS study [9][10][11], or on subsets of highly associated variants, for instance variants achieving genome-wide significance [12][13][14]. These approaches have identified many biologically plausible patterns of correlation (for instance in open chromatin marks for lipid traits in liver cell types and Crohn's disease in immune cells) and are broadly used for ranking the relative contribution of features.…”
mentioning
confidence: 99%