2016
DOI: 10.1038/nature19057

Analysis of protein-coding genetic variation in 60,706 humans

Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. We describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence …

citations
Cited by 8,956 publications
(9,493 citation statements)
references
References 41 publications
“…To validate the utility of our maps in the context of human disease, we extracted known disease‐associated variants from ClinVar (Landrum et al, 2016), as well as rare and common polymorphisms observed independent of disease from GnomAD (Lek et al, 2016), and somatic variants previously observed in tumors from COSMIC (Forbes et al, 2001). …”
Section: Results (mentioning)
confidence: 99%
“…We see four arguments for codon‐level mutagenesis: (i) Knowing the functional impact of all 19 possible substitutions at each positions enables clearer understanding of the biochemical properties that are required at each residue position; (ii) an analysis of > 60,000 unphased human exomes (Lek et al, 2016) found that each individual human harbors ~23 codons containing multiple nucleotide variants that together could encode an amino acid not encoded by either single variant; (iii) it is not straightforward to generate balanced libraries in which every single‐nucleotide variant has roughly equal representation, given that error‐prone amplification methods strongly favor transition mutations over transversion mutations, while still avoiding frequent introduction of new stop codons; and (iv) the major cost of DMS will likely continue to be development and validation of the functional assay, so using codon‐level mutagenesis instead of (or in addition to) nucleotide‐level mutagenesis has a relatively small impact on overall cost.…”
Section: Discussion (mentioning)
confidence: 99%
“…Variants were excluded if they had a minor allele frequency (MAF) >0.1% in the Exome Aggregation Consortium Browser (ExAC)11, 12 and/or if they were present in dbSNP 126, 129, and 131. Variants also were included if predicted damaging by Polyphen‐2 and below the MAF above 13.…”
Section: Methods (mentioning)
confidence: 99%
“…ExAC frequencies as of 4/2017 are listed 11. Assessment of pathogenicity according to criteria put forth by the American College of Medical Genetics is listed (LP = likely pathogenic; P = pathogenic; VUS = variant of uncertain significance) 14…”
Section: Methods (mentioning)
confidence: 99%
“…Duplicated reads were removed by Picard (version 2.9.2), and local realignment and base quality recalibration were performed by GATK version 3.7. Variants were identified with the GATK HaplotypeCaller, and variants with minor allele frequencies of <0.005 in all the following public databases and in‐house database were selected as rare variants: whole genome and WES data for East Asian population in Genome Aggregation Database,8 Human Genetic Variation Database,9 and allele frequency data of 2049 Japanese individuals 10. Final variants were annotated with Annovar 11…”
Section: Case Presentation (mentioning)
confidence: 99%