The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10⁻¹²) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. The area under the receiver operating characteristic curve (AUC) for REVEL was 0.046-0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants and 0.027-0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar than the AUCs for other ensemble methods. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale.
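The AUC comparisons quoted above can be read as follows: the AUC is the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen neutral one. The sketch below computes that quantity directly from two lists of scores. It is only an illustration of the metric, not the evaluation code used for REVEL; the function name `auc` and the example scores are hypothetical.

```python
def auc(pathogenic_scores, neutral_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (pathogenic, neutral) pairs in which the
    pathogenic variant scores higher; ties count as half a win.
    Quadratic in the number of variants, so suitable only for small sets.
    """
    if not pathogenic_scores or not neutral_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pathogenic_scores:
        for n in neutral_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(neutral_scores))


# Hypothetical scores in [0, 1]; higher means "more likely pathogenic".
example_pathogenic = [0.9, 0.8, 0.4]
example_neutral = [0.5, 0.3, 0.1]
print(auc(example_pathogenic, example_neutral))
```

An AUC of 0.5 corresponds to a predictor that ranks variants no better than chance, and 1.0 to perfect separation, so a difference of 0.046 to 0.182 between two tools is a difference in this pairwise ranking probability.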
Lung cancer is a major cause of death in the United States and other countries. The risk of lung cancer is greatly increased by cigarette smoking and by certain occupational exposures, but familial factors also clearly play a major role. To identify susceptibility genes for familial lung cancer, we conducted a genomewide linkage analysis of 52 extended pedigrees ascertained through probands with lung cancer who had several first-degree relatives with the same disease. Multipoint linkage analysis, under a simple autosomal dominant model, of all 52 families with three or more individuals affected by lung, throat, or laryngeal cancer, yielded a maximum heterogeneity LOD score (HLOD) of 2.79 at 155 cM on chromosome 6q (marker D6S2436). A subset of 38 pedigrees with four or more affected individuals yielded a multipoint HLOD of 3.47 at 155 cM. Analysis of a further subset of 23 multigenerational pedigrees with five or more affected individuals yielded a multipoint HLOD score of 4.26 at the same position. The 14 families with only three affected relatives yielded negative LOD scores in this region. A predivided samples test for heterogeneity comparing the LOD scores from the 23 multigenerational families with those from the remaining families was significant (P=.007). The 1-HLOD multipoint support interval from the multigenerational families extends from C6S1848 at 146 cM to 164 cM near D6S1035, overlapping a genomic region that is deleted in sporadic lung cancers as well as numerous other cancer types. Parametric linkage and variance-components analysis that incorporated effects of age and personal smoking also supported linkage in this region, but with somewhat diminished support. These results localize a major susceptibility locus influencing lung cancer risk to 6q23-25.
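For readers unfamiliar with the heterogeneity LOD score, the sketch below shows the standard admixture formulation: a proportion alpha of families is assumed to be linked, and the HLOD is the maximum over alpha of the sum of log10(alpha * 10^LOD_i + 1 - alpha) across families. This is a simplified illustration at a single map position with hypothetical per-family LOD scores; the function name `hlod` and the grid search are assumptions, not the software or data used in the study.

```python
import math


def hlod(family_lods, steps=1000):
    """Heterogeneity LOD under the admixture model at one map position.

    family_lods: per-family LOD scores at that position (hypothetical input).
    Returns (max_hlod, alpha_hat), where alpha_hat is the estimated
    proportion of linked families found by a grid search over [0, 1].
    """
    best_hlod, best_alpha = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(steps + 1):
        alpha = i / steps
        total = sum(
            math.log10(alpha * 10.0 ** lod + (1.0 - alpha))
            for lod in family_lods
        )
        if total > best_hlod:
            best_hlod, best_alpha = total, alpha
    return best_hlod, best_alpha


# Hypothetical example: one strongly linked family, one unlinked family.
score, alpha_hat = hlod([2.0, -2.0])
print(score, alpha_hat)
```

Because alpha = 0 yields a score of zero, the HLOD is never negative, and families with negative LOD scores pull the estimated alpha down rather than cancelling the evidence from linked families. That is why a subset of pedigrees can yield a higher HLOD than the full sample.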
Prostate cancer has a strong familial component but uncovering the molecular basis for inherited susceptibility for this disease has been challenging. Recently, a rare, recurrent mutation (G84E) in HOXB13 was reported to be associated with prostate cancer risk. Confirmation and characterization of this finding is necessary to potentially translate this information to the clinic. To examine this finding in a large international sample of prostate cancer families, we genotyped this mutation and 14 other SNPs in or flanking HOXB13 in 2,443 prostate cancer families recruited by the International Consortium for Prostate Cancer Genetics (ICPCG). At least one mutation carrier was found in 112 prostate cancer families (4.6 %), all of European descent. Within carrier families, the G84E mutation was more common in men with a diagnosis of prostate cancer (194 of 382, 51 %) than those without (42 of 137, 30 %), P = 9.9 × 10⁻⁸ [odds ratio 4.42 (95 % confidence interval 2.56–7.64)]. A family-based association test found G84E to be significantly over-transmitted from parents to affected offspring (P = 6.5 × 10⁻⁶). Analysis of markers flanking the G84E mutation indicates that it resides in the same haplotype in 95 % of carriers, consistent with a founder effect. Clinical characteristics of cancers in mutation carriers included features of high-risk disease. These findings demonstrate that the HOXB13 G84E mutation is present in ~5 % of prostate cancer families, predominantly of European descent, and confirm its association with prostate cancer risk. While future studies are needed to more fully define the clinical utility of this observation, this allele and others like it could form the basis for early, targeted screening of men at elevated risk for this common, clinically heterogeneous cancer. Electronic supplementary material: The online version of this article (doi:10.1007/s00439-012-1229-4) contains supplementary material, which is available to authorized users.
Purpose: We have previously mapped a major susceptibility locus influencing familial lung cancer risk to chromosome 6q23-25. However, the causal gene at this locus remains undetermined. In this study, we further refined this locus to identify a single candidate gene, by fine mapping using microsatellite markers and association studies using high-density single nucleotide polymorphisms (SNPs). Experimental Design: Six multigenerational families with five or more affected members were chosen for fine-mapping the 6q linkage region using microsatellite markers. For association mapping, we genotyped 24 6q-linked cases and 72 unrelated noncancer controls from the Genetic Epidemiology of Lung Cancer Consortium resources using the Affymetrix 500K chipset. Significant associations were validated in two independent familial lung cancer populations: 226 familial lung cases and 313 controls from the Genetic Epidemiology of Lung Cancer Consortium, and 154 familial cases and 325 controls from Mayo Clinic. Each familial case was chosen from one high-risk lung cancer family that has three or more affected members. Results: A region-wide scan across 6q23-25 found significant association between lung cancer susceptibility and three single nucleotide polymorphisms in the first intron of the RGS17 gene. This association was further confirmed in two independent familial lung cancer populations. By quantitative real-time PCR analysis of matched tumor and normal human tissues, we found that RGS17 transcript accumulation is highly and consistently increased in sporadic lung cancers. Human lung tumor cell proliferation and tumorigenesis in nude mice are inhibited upon knockdown of RGS17 levels. Conclusion: RGS17 is a major candidate for the familial lung cancer susceptibility locus on chromosome 6q23-25.
Lung cancer can occur sporadically in people with no known family history of lung cancer or it can be familial, occurring in multiple members of the same family.
Initial evidence of a genetic basis for susceptibility to lung cancer came from observations of individual differences in susceptibility to the same environmental risk factors (1 -3), familial aggregation of lung cancer after accounting for personal smoking (4), increased risk of lung cancer mortality in siblings (5), and
Three recent genome-wide association studies identified associations between markers in the chromosomal region 15q24-25.1 and the risk of lung cancer. We conducted a genome-wide association analysis to investigate associations between single-nucleotide polymorphisms (SNPs) and the risk of lung cancer, in which we used blood DNA from 194 case patients with familial lung cancer and 219 cancer-free control subjects. We identified associations between common sequence variants at 15q24-25.1 (that spanned LOC123688 [a hypothetical gene], PSMA4, CHRNA3, CHRNA5, and CHRNB4) and lung cancer. The risk of lung cancer was more than fivefold higher among those subjects who had both a family history of lung cancer and two copies of high-risk alleles rs8034191 (odds ratio [OR] = 7.20, 95% confidence interval [CI] = 2.21 to 23.37) or rs1051730 (OR = 5.67, CI = 2.21 to 14.60), both of which were located in the 15q24-25.1 locus, than among control subjects. Thus, further research to elucidate causal variants in the 15q24-25.1 locus that are associated with lung cancer is warranted.
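As background for the odds ratios and confidence intervals reported above, the sketch below computes an unadjusted odds ratio from a 2×2 table with a Wald (Woolf) 95% confidence interval on the log scale. The abstract does not state how its estimates were obtained, so this is a generic textbook calculation rather than the study's method. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    All four cell counts must be positive. z = 1.96 gives an
    approximate 95% interval. Returns (odds_ratio, lower, upper).
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(or_value, ci_low, ci_high)
```

Because the standard error depends on the reciprocals of the cell counts, a small number of subjects in any one cell produces a wide interval, which is a common reason for intervals as broad as those quoted above.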