2019
DOI: 10.1101/572347
Preprint

Whole exome sequencing and characterization of coding variation in 49,960 individuals in the UK Biobank

Abstract: The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. Here we describe the first tranche of large-scale exome sequence data for 49,960 study participants, revealing approximately 4 million coding variants (of which ~98.4% have frequency < 1%). The data includes 231,631 predicted loss of function variants, a >10-fold increase compared to imputed sequence for the same participants. Nea…

citations
Cited by 111 publications
(169 citation statements)
references
References 52 publications
“…Sixth, PolyFun (and SuSiE) were designed for quantitative phenotypes but can also be applied to binary phenotypes; alternative methods designed for binary phenotypes may further increase power. Seventh, we restricted our analyses to MAF>0.001 SNPs, because SNPs with lower MAFs are often not well-imputed 62 . Future studies with whole genome sequencing data could potentially fine-map MAF≤0.001 SNPs, but the performance of PolyFun + SuSiE on sequencing data has not been investigated.…”
Section: Discussion
mentioning
confidence: 99%
“…The UK Biobank (UKB) is a large cohort study consists of approximately half a million participants aged between 40 and 69 at recruitment, with extensive phenotypic records 18 65 and Functionally Equivalent (FE) 66 .…”
Section: UK Biobank Data
mentioning
confidence: 99%
“…Genotyped/imputed data were filtered with standard QC criteria in PLINK2 20 , e.g., MAF ≥ 0.01, Hardy-Weinberg Equilibrium test P ≥ 10⁻⁶, genotyping rate ≥ 0.95, and imputation info score ≥ 0.8 in real data analyses. In addition, the UKB released its first tranche of whole-exome sequence (WES) data of 49,960 participants in March 2019 65 . The WES variants had been called and cleaned by two different pipelines, Regeneron's Seal Point Balinese (SPB)…”
mentioning
confidence: 99%
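The QC thresholds quoted in the statement above can be read as a single per-variant predicate. The following is a minimal sketch of that reading, not the citing authors' pipeline and not a PLINK2 invocation: the `VariantStats` type, its field names, and the `passes_qc` function are hypothetical, and only the four default thresholds come from the quoted text.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Hypothetical per-variant summary; field names are illustrative only."""
    maf: float        # minor allele frequency
    hwe_p: float      # Hardy-Weinberg equilibrium test p-value
    call_rate: float  # fraction of samples with a non-missing genotype
    info: float       # imputation info score


def passes_qc(v: VariantStats,
              min_maf: float = 0.01,
              min_hwe_p: float = 1e-6,
              min_call_rate: float = 0.95,
              min_info: float = 0.8) -> bool:
    """Return True only if the variant meets all four quoted thresholds."""
    return (v.maf >= min_maf
            and v.hwe_p >= min_hwe_p
            and v.call_rate >= min_call_rate
            and v.info >= min_info)


if __name__ == "__main__":
    common = VariantStats(maf=0.05, hwe_p=0.5, call_rate=0.99, info=0.95)
    rare = VariantStats(maf=0.0005, hwe_p=0.5, call_rate=0.99, info=0.95)
    print(passes_qc(common), passes_qc(rare))
```

Under these thresholds a variant with MAF 0.0005 is dropped, which is the kind of rare variant that exome sequencing is meant to capture directly.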
“…We identified LRRK2 pLoF variants, and assessed the associated phenotypic changes, in three large cohorts of genetically characterized individuals. Firstly, we annotated LRRK2 pLoF variants in two large sequencing cohorts: the gnomAD v2.1.1 dataset, which contains 125,748 exomes and 15,708 genomes from unrelated individuals 9 , and the 46,062 exome-sequenced unrelated European individuals from the UK Biobank 33 . We identified 633 individuals in gnomAD and 258 individuals in the UK Biobank with 150 unique candidate LRRK2 LoF variants, a combined carrier frequency of 0.48%.…”
Section: Main Text
mentioning
confidence: 99%
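The combined carrier frequency in the statement above can be checked with simple arithmetic. This sketch assumes the denominator is the sum of the three cohort sizes quoted (gnomAD exomes, gnomAD genomes, UK Biobank exomes); the quoted text does not state how the denominator was formed, and the function name is hypothetical.

```python
def carrier_frequency_percent(carriers: int, cohort_sizes: list[int]) -> float:
    """Carriers divided by total individuals, expressed as a percentage."""
    total = sum(cohort_sizes)
    return 100.0 * carriers / total


if __name__ == "__main__":
    # Counts taken from the quoted statement.
    carriers = 633 + 258                 # gnomAD + UK Biobank carriers
    cohorts = [125_748, 15_708, 46_062]  # assumed denominator (see lead-in)
    print(f"{carrier_frequency_percent(carriers, cohorts):.2f}%")
```

With these inputs the result rounds to the 0.48% reported in the statement, which suggests the assumed denominator matches the one the citing authors used.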
“…We identified all individuals with putative LoF variants detected in the FE analysis pipeline using GATK 3.0 for variant calling and filtering 33 . We did not use the SPB pipeline calls due to advertised errors in the Regeneron Genetics Center pipeline at the time we were conducting these analyses.…”
Section: UK Biobank Variant Detection and Curation
mentioning
confidence: 99%