Genome-wide association studies have discovered numerous common variants associated with human cancers. However, the contribution of rare exome-wide variants to cancer, particularly protein-coding variants, remains largely unexplored. The UK Biobank provides detailed cancer follow-up information linked to whole-exome sequencing (WES) for approximately 450,000 participants, offering an unprecedented opportunity to evaluate the effect of exome variation on pan-cancer risk. Here, we performed exome-wide association studies (ExWAS) at the single-variant and gene levels to detect associations with 20 primary cancer types in a discovery set (WES-300k, N = 284,456) and a replication set (WES-150k, N = 143,478), separately. The ExWAS detected 143 independent variants at the variant level and 49 genes at the gene level; nine variants and eight genes were shared across cancers. In the cross-trait meta-analysis, we identified 239 additional independent pleiotropic variants, which mapped to genes whose functional relevance was supported by trans-omics analyses of transcriptomic and proteomic data. Further, we developed exome-wide risk scores (ERS) to identify high-risk populations based on rare variants with minor allele frequency (MAF) < 0.05. The ERS had satisfactory performance in cancer risk stratification, especially for individuals at extremely high risk (top 5% of ERS), who were frequently risk-allele carriers. In the replication set, the ERS (median C-index (IQR): 0.655 (0.636-0.667)) outperformed the traditional polygenic risk score (PRS) (median C-index (IQR): 0.585 (0.572-0.614)) for discrimination. Our findings offer further insight into the genetic architecture of human exomes for cancer susceptibility.
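To make the discrimination metric concrete, the following is a minimal sketch of how a risk score and its C-index could be computed. It assumes the score is a weighted sum of risk-allele counts and uses a simplified Harrell's C-index that skips pairs with tied follow-up times; the abstract does not describe the actual ERS construction, and the function names, weights, genotypes, and follow-up data below are all hypothetical toy values.

```python
from itertools import combinations


def risk_score(genotypes, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 per variant).

    Assumption: a simple additive score; the study's ERS may be built differently.
    """
    return sum(g * w for g, w in zip(genotypes, weights))


def harrell_c_index(times, events, scores):
    """Simplified Harrell's C-index for right-censored data.

    A pair is comparable when the person with the shorter follow-up time
    had an event. Pairs with tied times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # 'early' is the person with the shorter follow-up time.
        early, late = (i, j) if times[i] < times[j] else (j, i)
        if not events[early]:
            continue  # censored before the other person's time: not comparable
        comparable += 1
        if scores[early] > scores[late]:
            concordant += 1.0  # higher score, earlier event: concordant
        elif scores[early] == scores[late]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: three variants, four participants.
weights = [0.8, 0.5, 1.2]
genotypes = [[1, 0, 0], [0, 0, 0], [0, 1, 1], [1, 1, 0]]
times = [5.0, 10.0, 2.0, 8.0]          # follow-up time in years
events = [True, False, True, True]     # True = cancer diagnosed, False = censored

scores = [risk_score(g, weights) for g in genotypes]
c_index = harrell_c_index(times, events, scores)
print(f"C-index: {c_index:.3f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported medians of 0.655 (ERS) and 0.585 (PRS) are compared.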